The Y Programming Language

by H1ghBre4k3r.

This text is meant as a basic guide to the Y Programming Language. I want to introduce you to the foundations and concepts of the language, as well as give you some orientation on what to build with it.

If you happen to find any issues within this documentation, feel free to open an issue in the official repository.

Pre-Requisites

At the current stage, the language only exists as a "theoretical concept", since the only stable (and somewhat working) components are the lexer and the parser. Because of that, the only pre-requisites for "using" the language are a working machine and terminal access, with Rust and Git installed and available. First, clone the repository:

git clone https://github.com/H1ghBre4k3r/y-lang.git
cd y-lang

After that, you can just build and use the executable:

cargo build --release
./target/release/y-lang <FILE_NAME>

This prints the lexed tokens as well as the parsed AST.
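To have something to pass as <FILE_NAME>, a small program can be assembled from constructs covered later in this guide. The following is a minimal sketch that only uses syntax shown in the later chapters; it has not been checked against the current lexer and parser, and the function name add is chosen here purely for illustration:

```
// A named function with an expression as its last line
fn add(x: i64, y: i64): i64 {
    x + y
}

// Entry point with an explicit return
fn main(): i64 {
    let a = add(42, 1337);
    return a;
}
```

Save it under any file name and pass that name to the executable in place of <FILE_NAME>.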

Language Basics

Y shares its foundations with many other programming languages. This chapter introduces those foundations and provides some examples.

Variables

Variables are fundamental building blocks in Y programs. They store values during program execution. Y emphasizes immutability by default, requiring explicit declaration when mutability is needed.

Variable Declaration

The basic syntax for declaring a variable uses the let keyword:

let x = 12;
let a = baz(x);
let test_char = 'a';
let test_str = "test";

A variable declaration consists of:

  • The let keyword
  • An identifier (variable name)
  • The = assignment operator
  • An expression that provides the initial value

Type Annotations

Y can infer types automatically, but you can also explicitly specify the type:

let foo: u32 = 42;
let x: (i64) -> i64 = \(x) => x;  // Function type
let arr: &[i64] = &[];            // Array reference type

Mutability

Variables are immutable by default. Once assigned, their value cannot be changed:

let foo = 42;
foo = 1337; // Error: cannot assign to immutable variable

To create a mutable variable, use the mut keyword:

let mut foo = 42;
foo = 1337;  // Valid: foo is mutable

let mut i = 0;
while (i < 10) {
    i = i + 1;  // Mutating i in a loop
}

Practical Examples

From real Y programs:

fn main(): i64 {
    // Immutable variables
    let x = 12;
    let a = baz(x);

    // Mutable array
    let mut arr = &[42, 1337];
    arr[0] = 100;  // Modifying array contents

    // Mutable struct
    let mut my_struct = TestStruct {
        x: 42,
        bar: add
    };
    my_struct.x = 100;  // Modifying struct field

    return x + a;
}

Best Practices

  • Prefer immutable variables when possible - they prevent accidental mutations and make code easier to reason about
  • Use mut only when you actually need to modify the variable's value
  • Choose descriptive variable names that clearly indicate their purpose

Data Types

Y is a statically-typed language where every value has a type. The type system includes built-in primitive types and user-defined types like structs.

Primitive Types

Numeric Types

Y supports several numeric types with explicit bit-width specifications:

Integer Types:

  • i64 - 64-bit signed integer (most common)
  • u32 - 32-bit unsigned integer

Floating Point Types:

  • f64 - 64-bit floating point number

let age: i64 = 25;
let count: u32 = 42;
let pi: f64 = 3.1415;
let price = 133.7;  // Type inferred as f64

Character and String Types

  • char - Single Unicode character
  • str - String literal (immutable string slice)

let letter = 'a';
let greeting = "Hello, World!";
let test_char = 'b';
let test_str = "test";

Boolean Type

  • bool - Boolean values (true or false)

let is_ready = true;
let is_finished = false;

Function Types

Y treats functions as first-class values with explicit type signatures:

// Function type: takes two i64 parameters, returns i64
let add_func: (i64, i64) -> i64 = add;

// Lambda with function type
let identity: (i64) -> i64 = \(x) => x;

// Function taking a function as parameter
fn takes_function(func: (i64, i64) -> i64): i64 {
    func(42, 69)
}

Array Types

Arrays are reference types denoted with &[T]:

let numbers: &[i64] = &[1, 2, 3, 4, 5];
let empty_array: &[i64] = &[];
let chars = &['a', 'b', 'c'];  // Type: &[char]

User-Defined Types

You can define custom types using structs:

struct Person {
    name: str;
    age: i64;
}

let person = Person {
    name: "Alice",
    age: 30
};

Type Inference

Y can automatically infer types in many cases:

let x = 42;        // Inferred as i64
let y = 3.14;      // Inferred as f64
let name = "Bob";  // Inferred as str
let flag = true;   // Inferred as bool

Explicit Type Annotations

When type inference isn't sufficient or for clarity, you can specify types explicitly:

let foo: u32 = 42;
let process: (str) -> void = printf;
let numbers: &[i64] = &[];

Constants

Constants are immutable values known at compile time:

const PI: f64 = 3.1415;
const MAX_SIZE: i64 = 1000;

Type Compatibility

Y has strict type checking. Different numeric types don't automatically convert:

let x: i64 = 42;
let y: u32 = 100;
// let z = x + y;  // Error: type mismatch

Operators

Y provides a comprehensive set of operators for various operations including arithmetic, comparison, and assignment.

Arithmetic Operators

Y supports standard arithmetic operations on numeric types:

let a = 10;
let b = 3;

let sum = a + b;        // Addition: 13
let difference = a - b; // Subtraction: 7
let product = a * b;    // Multiplication: 30
let quotient = a / b;   // Division: 3 (integer division)

// Works with floating point too
let x = 10.5;
let y = 2.0;
let result = x / y;     // 5.25

Comparison Operators

All comparison operators return boolean values:

let x = 42;
let y = 69;

let equal = x == y;           // false
let not_equal = x != y;       // true
let less_than = x < y;        // true
let greater_than = x > y;     // false
let less_equal = x <= y;      // true
let greater_equal = x >= y;   // false

Assignment Operators

The basic assignment operator is =:

let mut counter = 0;
counter = 5;        // Simple assignment

let mut arr = &[1, 2, 3];
arr[0] = 100;       // Array element assignment

let mut person = Person { name: "Alice", age: 25 };
person.age = 26;    // Struct field assignment

Operator Precedence

Operators have the following precedence (highest to lowest):

  1. Postfix operators (function calls, array indexing, property access)
  2. Prefix operators (unary minus, etc.)
  3. Multiplication and Division (*, /)
  4. Addition and Subtraction (+, -)
  5. Comparison operators (==, !=, <, >, <=, >=)

let result = 2 + 3 * 4;     // 14, not 20 (multiplication first)
let comparison = 5 < 3 + 4; // true (addition first, then comparison)

Using Operators with Different Types

Numeric Operations

let int_result = 42 + 17;        // i64 + i64
let float_result = 3.14 + 2.86;  // f64 + f64

String and Character Operations

Currently, Y doesn't support string concatenation with +, but you can work with individual characters:

let ch1 = 'a';
let ch2 = 'b';
let text = "Hello";

Boolean Operations

let flag1 = true;
let flag2 = false;
let x = 10;

// Using comparison results
let result = x > 5;  // true
if (result) {
    // do something
}

Practical Examples

Mathematical Calculations

fn calculate_area(radius: f64): f64 {
    const PI: f64 = 3.1415;
    return PI * radius * radius;
}

fn baz(x: i64): i64 {
    let intermediate = x * 2;
    return intermediate;
}

Conditional Logic

fn main(): i64 {
    let x = 12;
    let y = 24;

    if (x < y) {
        return x + y;
    } else {
        return x - y;
    }
}

Loop Counters

fn count_to_ten(): void {
    let mut i = 0;
    while (i < 10) {
        i = i + 1;  // Using arithmetic and assignment
    }
}

Operator Overloading

Y supports operator overloading through instance methods (though this is an advanced feature):

instance i64 {
    declare add(i64): i64;  // Custom addition behavior
}

Comments

Comments are used to add explanatory notes to your code that are ignored by the compiler. They help make code more readable and maintainable.

Line Comments

Y supports single-line comments using //. Everything after // on that line is treated as a comment:

// This is a comment
let x = 42;  // This is also a comment

fn main(): i64 {
    // Calculate the result
    let a = 10;
    let b = 20;
    return a + b;  // Return the sum
}

Comment Best Practices

Explaining Intent

Use comments to explain why something is done, not just what is done:

// Good: Explains the reasoning
let buffer_size = 1024;  // Use power of 2 for optimal memory alignment

// Less helpful: Just restates the code
let x = 42;  // Set x to 42

Documenting Complex Logic

fn complex_calculation(input: i64): i64 {
    // Apply custom business rule: multiply by 3, add offset
    let intermediate = input * 3;
    let offset = 10;

    // Ensure result stays within acceptable range
    if (intermediate + offset > 1000) {
        return 1000;
    } else {
        return intermediate + offset;
    }
}

Temporary Debugging

Comments can be used to temporarily disable code during development:

fn debug_function(): void {
    let x = calculate_value();
    // let y = expensive_operation();  // Temporarily disabled

    process_result(x);
}

Comments in Examples

Looking at real Y code from the examples:

struct TestStruct {
    x: i64;
    bar: (i64, i64) -> i64;
}

fn main(): i64 {
    let a = add(42, 1337);

    // Create mutable array with initial values
    let mut arr = &[42, 1337];

    // Access array elements
    let b = explicit_return_add(arr[0], arr[1]);

    // Initialize struct with function reference
    let my_struct = TestStruct {
        x: 42,
        bar: add  // Function as struct field
    };

    return 0;
}

Multi-line Explanations

For longer explanations, use multiple single-line comments:

// This function implements a custom sorting algorithm
// optimized for small arrays (< 10 elements).
// For larger arrays, consider using a different approach.
fn small_array_sort(arr: &[i64]): &[i64] {
    // Implementation here...
    return arr;
}

Comment Style Guidelines

  • Use clear, concise language
  • Keep comments up-to-date with code changes
  • Avoid obvious comments that just restate the code
  • Use proper grammar and punctuation
  • Be consistent with comment style throughout your codebase

What Not to Comment

Avoid commenting obvious code:

// Bad examples:
let x = 42;        // Assign 42 to x
i = i + 1;         // Increment i
return result;     // Return the result

// Good examples:
let timeout = 30;  // Connection timeout in seconds
i = i + 1;         // Move to next element
return result;     // Early return to avoid expensive calculation

Control Structures

Y provides some structures to control the logic of your program.

If-Else

If you want to conditionally execute certain parts of your program, you can utilise the common if-else structure:

if (foo) {
    // do some stuff
} else {
    // do other stuff
}

The expression right after the if needs to evaluate to a boolean value. More on expressions will be discussed in a later chapter. For now, you can imagine simple comparison operations:

if (bar == 42) {
    // ...
}

The first block will be executed if the expression evaluates to true. Similarly, the second block will be executed if the expression evaluates to false.

The else block is optional. The first block is required, although it may be empty.
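Both cases can be sketched as follows. This is an illustration built only from syntax used elsewhere in this guide, not verified against the compiler; the variable name count is made up for the example:

```
let mut count = 0;

// No else block: nothing happens when the condition is false
if (count == 0) {
    count = 1;
}

// Empty first block: only the else block does any work
if (count > 10) {
} else {
    count = 10;
}
```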

Loops

To repeatedly execute a block of code, Y provides you with loop structures.

While Loops

To execute a block of code while a certain expression evaluates to true, you can use the while loop:

while (foo) {
    // do something
}

Again, foo has to evaluate to a boolean value. It will be evaluated upon each run of the loop.
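As a small sketch of that re-evaluation, assuming the usual while semantics in which the condition is checked before every iteration (the counter i is only an illustration):

```
let mut i = 0;

// The condition i < 3 is checked again before each run of the body
while (i < 3) {
    i = i + 1;
}

// The loop ends as soon as the condition evaluates to false, so i is 3 here
```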

Expressions

Y is an expression-oriented language, meaning that most constructs evaluate to a value. Understanding expressions is fundamental to writing effective Y code.

What is an Expression?

An expression is a piece of code that evaluates to a value. In Y, almost everything is an expression, including:

  • Literals (numbers, strings, booleans)
  • Variable references
  • Function calls
  • Arithmetic operations
  • Conditionals (if-else)
  • Blocks
  • Lambdas

Basic Expressions

Literal Expressions

42          // Integer literal
3.14        // Floating point literal
"hello"     // String literal
'a'         // Character literal
true        // Boolean literal
false       // Boolean literal

Variable Expressions

let x = 42;
let y = x;  // x is an expression that evaluates to 42

Arithmetic Expressions

let a = 10;
let b = 5;

let sum = a + b;        // Addition expression
let product = a * 2;    // Multiplication expression
let complex = (a + b) * 2 - 1;  // Complex arithmetic expression

Function Call Expressions

Function calls are expressions that evaluate to the return value:

fn add(x: i64, y: i64): i64 {
    x + y  // This is also an expression (the last one in the function)
}

let result = add(10, 20);  // Function call expression
let nested = add(add(1, 2), add(3, 4));  // Nested function calls

Conditional Expressions

If-else constructs are expressions in Y:

let max = if (a > b) {
    a
} else {
    b
};

// Can be used directly in other expressions
let result = if (x > 0) { x } else { -x } + 10;

Block Expressions

Blocks evaluate to the value of their last expression:

let result = {
    let x = 10;
    let y = 20;
    x + y  // This value becomes the result of the block
};  // result is 30

Lambda Expressions

Lambdas are expressions that create anonymous functions:

let add_one = \(x) => x + 1;
let multiply = \(x, y) => x * y;

// Using lambda expressions directly
let numbers = &[1, 2, 3];
let transformed = map(numbers, \(x) => x * 2);

Array Expressions

Array literals are expressions:

let numbers = &[1, 2, 3, 4, 5];
let empty = &[];
let mixed = &[add(1, 2), multiply(3, 4), 42];

Struct Initialization Expressions

Creating structs is also an expression:

struct Point {
    x: i64;
    y: i64;
}

let origin = Point { x: 0, y: 0 };
let point = Point {
    x: calculate_x(),
    y: calculate_y()
};

Property Access Expressions

Accessing struct fields and calling methods:

let person = Person { name: "Alice", age: 25 };
let name = person.name;        // Property access expression
let id = person.get_id();      // Method call expression

Practical Examples

Expression-Heavy Function

fn calculate_distance(x1: f64, y1: f64, x2: f64, y2: f64): f64 {
    // Everything here is composed of expressions
    let dx = x2 - x1;
    let dy = y2 - y1;
    sqrt(dx * dx + dy * dy)  // Final expression becomes return value
}

Chaining Expressions

fn main(): i64 {
    let my_struct = TestStruct {
        x: 42,
        bar: add
    };

    // Chaining property access and function call expressions
    let result = my_struct.bar(10, 20);

    // Complex expression with multiple parts
    let final_result = if (result > 0) {
        result + my_struct.x
    } else {
        my_struct.x * 2
    };

    return final_result;
}

Expression Composition

fn process_data(): i64 {
    let data = &[1, 2, 3, 4, 5];

    // Composed expression using array access, function calls, and arithmetic
    let result = process_value(data[0]) +
                 process_value(data[1]) * 2 +
                 if (data[2] > 3) { data[2] } else { 0 };

    return result;
}

Statement vs Expression

While most things in Y are expressions, some constructs are statements:

// Statements (don't evaluate to values):
let x = 42;              // Variable declaration
x = 100;                 // Assignment
return x;                // Return statement

// Expressions (evaluate to values):
x + y                    // Arithmetic
if (x > 0) { x } else { -x }  // Conditional
{                        // Block
    let temp = x + 1;
    temp * 2
}

Best Practices

  1. Leverage expression-oriented style: Use the fact that if-else and blocks are expressions
  2. Keep expressions readable: Break complex expressions into intermediate variables when needed
  3. Use the last expression in functions: Instead of explicit return, let the last expression be the return value
  4. Compose expressions thoughtfully: Balance conciseness with clarity

// Good: Clear and expressive
fn clamp(value: i64, min: i64, max: i64): i64 {
    if (value < min) {
        min
    } else if (value > max) {
        max
    } else {
        value
    }
}

// Also good: Breaking down complex logic
fn complex_calculation(input: i64): i64 {
    let base = input * 2;
    let adjusted = base + 10;
    if (adjusted > 100) { 100 } else { adjusted }
}

Data Structures

Y provides several built-in data structures for organizing and storing data. This section covers the fundamental data structures available in the language.

Overview

Y supports the following primary data structures:

  • Arrays - Ordered collections of elements of the same type
  • Structs - Custom data types that group related fields together

Arrays

Arrays in Y are reference types that store multiple values of the same type in an ordered sequence. They use the &[T] syntax where T is the element type.

let numbers = &[1, 2, 3, 4, 5];
let chars = &['a', 'b', 'c'];
let empty: &[i64] = &[];

Arrays support indexing for accessing and modifying elements:

let mut arr = &[10, 20, 30];
let first = arr[0];  // Access: 10
arr[1] = 99;         // Modify: [10, 99, 30]

Structs

Structs allow you to create custom data types by grouping related fields:

struct Person {
    name: str;
    age: i64;
}

let person = Person {
    name: "Alice",
    age: 30
};

Structs support:

  • Field access via dot notation
  • Mutable field modification
  • Nesting of other structs
  • Methods through instance blocks
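The four capabilities listed above can be sketched together in one snippet. It relies only on syntax that appears in the Structs chapter of this guide and has not been run through the compiler; the names Home, Resident, and get_age are hypothetical and chosen for this example:

```
struct Home {
    city: str;
}

struct Resident {
    name: str;
    age: i64;
    home: Home;  // Nesting of another struct
}

instance Resident {
    fn get_age(): i64 {
        this.age
    }
}

let mut resident = Resident {
    name: "Alice",
    age: 30,
    home: Home {
        city: "Springfield"
    }
};

let city = resident.home.city;   // Field access via dot notation
resident.age = 31;               // Mutable field modification
let age = resident.get_age();    // Method from the instance block
```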

Choosing the Right Data Structure

  • Use arrays when you need an ordered collection of the same type of data
  • Use structs when you need to group different types of data that belong together
  • Combine both for complex data modeling (arrays of structs, structs with array fields)

Example: Combining Data Structures

struct Student {
    name: str;
    grades: &[i64];
}

let students = &[
    Student {
        name: "Alice",
        grades: &[95, 87, 92]
    },
    Student {
        name: "Bob",
        grades: &[88, 79, 94]
    }
];

The following pages provide detailed information about each data structure type.

Arrays

Arrays in Y are reference types that store multiple values of the same type in an ordered sequence. They provide efficient access to elements by index.

Array Syntax

Arrays use the &[T] type syntax, where T is the element type:

let numbers: &[i64] = &[1, 2, 3, 4, 5];
let characters: &[char] = &['a', 'b', 'c'];
let booleans: &[bool] = &[true, false, true];

Creating Arrays

Array Literals

The most common way to create arrays is using array literal syntax:

let fruits = &["apple", "banana", "orange"];
let primes = &[2, 3, 5, 7, 11];
let mixed_numbers = &[42, 1337, 0];

Empty Arrays

Empty arrays require explicit type annotation:

let empty_numbers: &[i64] = &[];
let empty_strings: &[str] = &[];

Arrays from Expressions

Array elements can be any expression:

fn get_value(): i64 { 42 }

let computed = &[
    get_value(),
    10 + 20,
    if (true) { 100 } else { 0 }
];

Accessing Array Elements

Use square bracket notation with zero-based indexing:

let numbers = &[10, 20, 30, 40, 50];

let first = numbers[0];   // 10
let third = numbers[2];   // 30
let last = numbers[4];    // 50

Modifying Arrays

Arrays can be mutable, allowing you to change element values:

let mut scores = &[85, 92, 78];

scores[0] = 95;     // Change first element
scores[2] = 88;     // Change third element
// scores is now &[95, 92, 88]

Note: You can only modify elements, not add or remove them (arrays have fixed size).

Array Examples from Y Programs

Basic Array Operations

fn main(): void {
    let mut arr = &[42, 1337];
    let arr2 = &[1337, 5];

    // Access elements
    let first = arr[0];
    let value = arr2[3];  // Out of bounds: arr2 has only two elements (valid indices are 0 and 1)

    // Modify elements
    arr[0] = 100;
    arr[1] = 200;
}

Arrays with Different Types

fn working_with_arrays(): void {
    // Character arrays
    let mut char_array = &['a', 'b'];
    char_array[1] = 'z';

    // Variables and literals mixed (all elements are still char)
    let test_char = 'a';
    let mut foo = &[test_char, 'b'];
    foo[1] = test_char;
}

Arrays in Structs

struct Container {
    data: &[i64];
    size: i64;
}

fn create_container(): Container {
    Container {
        data: &[1, 2, 3, 4, 5],
        size: 5
    }
}

Array Type Compatibility

Arrays are strictly typed - all elements must be the same type:

// Valid:
let numbers = &[1, 2, 3];           // All i64
let words = &["hello", "world"];    // All str

// Invalid:
// let mixed = &[1, "hello", true]; // Error: mixed types

Working with Array References

Arrays in Y are reference types, meaning they refer to data stored elsewhere:

let original = &[1, 2, 3];
let reference = original;  // Both refer to the same array data

let mut mutable_ref = &[4, 5, 6];
modify_array(mutable_ref);  // Function can modify the array

Practical Examples

Processing Array Data

fn sum_array(arr: &[i64]): i64 {
    let mut total = 0;
    let mut i = 0;

    // Manual iteration (Y doesn't have for loops yet)
    // array_length is assumed to be a helper that returns the element count
    while (i < array_length(arr)) {
        total = total + arr[i];
        i = i + 1;
    }

    return total;
}

Array as Function Parameter

fn process_scores(scores: &[i64]): i64 {
    let first_score = scores[0];
    let last_score = scores[scores.length() - 1];  // Assuming length method
    return (first_score + last_score) / 2;
}

Arrays in Complex Data Structures

struct Matrix {
    rows: &[&[i64]];  // Array of arrays
    width: i64;
    height: i64;
}

struct DataSet {
    values: &[f64];
    labels: &[str];
    metadata: &[bool];
}

Array Limitations and Considerations

  1. Fixed Size: Arrays have a fixed size determined at creation
  2. Bounds Checking: Accessing out-of-bounds indices may cause runtime errors
  3. Homogeneous: All elements must be the same type
  4. Reference Type: Arrays are references, not value types

Best Practices

  1. Initialize with known data: Prefer creating arrays with initial values
  2. Use meaningful names: Choose descriptive variable names for arrays
  3. Bounds awareness: Be careful with index calculations to avoid out-of-bounds access
  4. Type consistency: Ensure all elements are the same type
  5. Mutability: Only make arrays mutable when you need to modify elements

// Good practices:
let player_scores = &[95, 87, 92, 88];  // Descriptive name
let mut high_scores = &[100, 95, 90];   // Mutable only when needed

// Less ideal:
let a = &[1, 2, 3];                     // Non-descriptive name
let mut data = &[1, 2, 3];              // Unnecessary mutability

Structs

Structs in Y allow you to create custom data types by grouping related fields together. They're fundamental for organizing data and creating meaningful abstractions in your programs.

Struct Declaration

Define a struct using the struct keyword followed by field declarations:

struct Person {
    name: str;
    age: i64;
}

struct Point {
    x: f64;
    y: f64;
}

struct TestStruct {
    x: i64;
    bar: (i64, i64) -> i64;  // Function type as field
}

Each field has:

  • A name (identifier)
  • A type annotation
  • A semicolon terminator

Struct Instantiation

Create struct instances using struct literal syntax:

let person = Person {
    name: "Alice",
    age: 30
};

let origin = Point {
    x: 0.0,
    y: 0.0
};

// Using function references
let my_struct = TestStruct {
    x: 42,
    bar: add  // add is a function
};

All fields must be provided during instantiation.
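A short sketch of what this rule implies, using the Point struct declared above. The rejected line is left commented out, and the exact error reported is not shown because it is not specified in this guide:

```
// Valid: both fields of Point are provided
let complete = Point {
    x: 1.0,
    y: 2.0
};

// Expected to be rejected: the field y is missing
// let incomplete = Point {
//     x: 1.0
// };
```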

Field Access

Access struct fields using dot notation:

let person = Person {
    name: "Bob",
    age: 25
};

let name = person.name;  // "Bob"
let age = person.age;    // 25

Mutable Structs

Structs can be mutable, allowing field modification:

let mut person = Person {
    name: "Charlie",
    age: 20
};

person.age = 21;        // Modify age field
person.name = "Chuck";  // Modify name field

Nested Structs

Structs can contain other structs as fields:

struct Address {
    street: str;
    city: str;
}

struct Person {
    name: str;
    address: Address;
}

let person = Person {
    name: "David",
    address: Address {
        street: "123 Main St",
        city: "Springfield"
    }
};

// Access nested fields
let city = person.address.city;

Functions as Struct Fields

Y allows function types as struct fields:

struct Calculator {
    operation: (i64, i64) -> i64;
    name: str;
}

fn add(x: i64, y: i64): i64 {
    x + y
}

let calc = Calculator {
    operation: add,
    name: "Adder"
};

// Call the function through the struct
let result = calc.operation(10, 20);  // 30

Real Examples from Y Code

Complex Struct Usage

struct FooStruct {
    id: i64;
    amount: f64;
}

struct Bar {
    t: TestStruct;
}

fn main(): void {
    let foo = FooStruct {
        id: 42,
        amount: 133.7
    };

    let mut b = Bar {
        t: TestStruct {
            x: 1337,
            bar: add
        }
    };

    // Nested field access and modification
    b.t.x = 42;

    // Calling function through nested struct
    b.t.bar(4, 2);
}

Assignment Example

struct Foo {
    b: i64;
}

fn b(): i64 {
    17
}

fn main(): void {
    let mut a = Foo {
        b: b()  // Function call in field initialization
    };

    a.b = 42;  // Direct field assignment
}

Methods on Structs

Structs can have methods defined through instance blocks:

struct FooStruct {
    id: i64;
    amount: f64;
}

instance FooStruct {
    fn get_id(): i64 {
        this.id
    }

    fn get_amount(): f64 {
        this.amount
    }

    fn set_amount(new_amount: f64): void {
        this.amount = new_amount;
    }
}

// Usage
let foo = FooStruct {
    id: 42,
    amount: 133.7
};

let id = foo.get_id();        // Method call
let amount = foo.get_amount(); // Another method call

Struct Patterns and Best Practices

// Good: Related fields grouped together
struct Circle {
    center_x: f64;
    center_y: f64;
    radius: f64;
}

// Better: Using nested structs for better organization
struct Point {
    x: f64;
    y: f64;
}

struct Circle {
    center: Point;
    radius: f64;
}

Using Structs as Parameters

fn calculate_area(circle: Circle): f64 {
    const PI: f64 = 3.14159;
    return PI * circle.radius * circle.radius;
}

fn move_point(point: Point, dx: f64, dy: f64): Point {
    Point {
        x: point.x + dx,
        y: point.y + dy
    }
}

Structs with Arrays

struct Student {
    name: str;
    grades: &[i64];
}

struct Class {
    name: str;
    students: &[Student];
}

let math_class = Class {
    name: "Mathematics",
    students: &[
        Student {
            name: "Alice",
            grades: &[95, 87, 92]
        },
        Student {
            name: "Bob",
            grades: &[88, 79, 94]
        }
    ]
};

Common Patterns

Builder Pattern

struct Config {
    debug: bool;
    port: i64;
    host: str;
}

fn create_config(): Config {
    Config {
        debug: false,
        port: 8080,
        host: "localhost"
    }
}

Data Transfer Objects

struct UserData {
    username: str;
    email: str;
    created_at: i64;  // timestamp
}

struct Response {
    status: i64;
    data: UserData;
    message: str;
}

Memory and Performance

  • Structs are value types - they contain the actual data
  • Field access is direct and efficient
  • Nested structs are stored inline
  • Mutable structs allow in-place modification

Best Practices

  1. Use descriptive names: Choose clear field and struct names
  2. Group related data: Keep related fields together in the same struct
  3. Consider immutability: Use mutable structs only when necessary
  4. Organize with nesting: Use nested structs for better data organization
  5. Document complex structures: Use comments for complex struct relationships

// Good struct design
struct BankAccount {
    account_number: str;
    balance: f64;
    owner: Person;      // Nested struct
    is_active: bool;
}

// Clear field access
let account = create_account();
let balance = account.balance;
let owner_name = account.owner.name;

Functions and Methods

Functions are fundamental building blocks in Y that allow you to organize code into reusable, testable units. Y treats functions as first-class values, supporting both named functions and anonymous lambda expressions.

Overview

Y supports several types of callable constructs:

  • Named Functions - Traditional function declarations with explicit names
  • Lambda Expressions - Anonymous functions that can be assigned to variables or passed as arguments
  • Instance Methods - Functions associated with specific types through instance blocks
  • External Declarations - Declarations for functions implemented outside Y (like C functions)

Function Features

Y functions support:

  • Explicit type signatures with parameter and return types
  • First-class values - functions can be stored in variables and passed as arguments
  • Expression-oriented - functions can end with expressions instead of explicit returns
  • Type inference in many contexts
  • Higher-order functions - functions that take or return other functions

Basic Function Syntax

fn function_name(param1: Type1, param2: Type2): ReturnType {
    // function body
    expression_or_return
}

Lambda Syntax

let lambda_var = \(param1, param2) => expression;

Method Syntax

instance TypeName {
    fn method_name(param: Type): ReturnType {
        // method body
    }
}

Example Usage

// Named function
fn add(x: i64, y: i64): i64 {
    x + y
}

// Lambda function
let multiply = \(x, y) => x * y;

// Function as struct field
struct Calculator {
    operation: (i64, i64) -> i64;
}

let calc = Calculator {
    operation: add
};

// Higher-order function
fn apply_twice(func: (i64) -> i64, value: i64): i64 {
    func(func(value))
}

The following sections provide detailed information about each type of function and method construct in Y.

Functions

Functions in Y allow you to encapsulate reusable code with explicit type signatures. They are first-class values that can be stored in variables, passed as arguments, and returned from other functions.

Function Declaration

The basic syntax for declaring a function:

fn function_name(parameter: Type): ReturnType {
    // function body
}

Simple Functions

fn greet(): void {
    printf("Hello, World!\n");
}

fn add(x: i64, y: i64): i64 {
    x + y
}

fn get_answer(): i64 {
    42
}

Function Parameters

Functions can accept multiple parameters with explicit types:

fn calculate_area(width: f64, height: f64): f64 {
    width * height
}

fn format_name(first: str, last: str): str {
    // String concatenation would be here if supported
    first  // For now, just return first name
}

Return Types

Functions must specify their return type:

fn divide(a: i64, b: i64): i64 {
    a / b
}

fn is_positive(x: i64): bool {
    x > 0
}

fn do_nothing(): void {
    // No return value
}

Return Statements

Functions can use explicit return statements or end with an expression:

// Explicit return
fn explicit_return_add(x: i64, y: i64): i64 {
    return x + y;
}

// Expression return (no semicolon on last line)
fn add(x: i64, y: i64): i64 {
    x + y
}

// Statements followed by an explicit return
fn baz(x: i64): i64 {
    let intermediate = x * 2;
    return intermediate;
}

Functions as First-Class Values

Functions can be stored in variables and passed around:

fn add(x: i64, y: i64): i64 {
    x + y
}

fn multiply(x: i64, y: i64): i64 {
    x * y
}

// Store function in variable
let operation: (i64, i64) -> i64 = add;

// Use the function variable
let result = operation(10, 20);  // 30

// Function as struct field
struct Calculator {
    op: (i64, i64) -> i64;
    name: str;
}

let calc = Calculator {
    op: multiply,
    name: "Multiplier"
};

Higher-Order Functions

Functions can take other functions as parameters:

fn takes_function(func: (i64, i64) -> i64): i64 {
    func(42, 69)
}

fn apply_operation(x: i64, y: i64, op: (i64, i64) -> i64): i64 {
    op(x, y)
}

// Usage
let result1 = takes_function(add);       // 111
let result2 = apply_operation(10, 5, multiply);  // 50

Function Examples from Y Code

Basic Function Usage

declare printf: (str) -> void;

fn baz(x: i64): i64 {
    let intermediate = x * 2;
    return intermediate;
}

fn main(): i64 {
    printf("Foo\n");
    let x = 12;
    let a = baz(x);
    return x + a;
}

Functions Returning Functions

fn foobar(): (i64) -> i64 {
    return \(x) => x;  // Returns a lambda
}

fn create_adder(base: i64): (i64) -> i64 {
    return \(x) => x + base;
}
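
A returned function is called like any other function value. The following sketch shows how create_adder might be used; add_five is a hypothetical name, and the result assumes that the returned lambda can capture base from the enclosing function, which depends on the current state of closure support.

```why
fn create_adder(base: i64): (i64) -> i64 {
    return \(x) => x + base;  // the lambda refers to base from the enclosing function
}

fn main(): i64 {
    let add_five = create_adder(5);  // a function of type (i64) -> i64
    return add_five(10);             // 15, provided base is captured
}
```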

Functions with Complex Parameters

fn process_struct(data: TestStruct): i64 {
    return data.x;
}

fn takes_array(arr: &[i64]): i64 {
    arr[0]
}

Function Patterns

Factory Functions

fn create_point(x: f64, y: f64): Point {
    Point { x: x, y: y }
}

fn create_default_config(): Config {
    Config {
        debug: false,
        port: 8080,
        timeout: 30
    }
}

Utility Functions

fn max(a: i64, b: i64): i64 {
    if (a > b) { a } else { b }
}

fn clamp(value: i64, min: i64, max: i64): i64 {
    if (value < min) {
        min
    } else if (value > max) {
        max
    } else {
        value
    }
}

Recursive Functions

fn factorial(n: i64): i64 {
    if (n <= 1) {
        1
    } else {
        n * factorial(n - 1)
    }
}

fn fibonacci(n: i64): i64 {
    if (n <= 1) {
        n
    } else {
        fibonacci(n - 1) + fibonacci(n - 2)
    }
}

Function Type Signatures

Function types use the (param_types) -> return_type syntax:

// Function that takes no parameters and returns i64
let getter: () -> i64 = get_answer;

// Function that takes two i64s and returns i64
let binary_op: (i64, i64) -> i64 = add;

// Function that takes a function and returns i64
let higher_order: ((i64) -> i64) -> i64 = some_function;

// Complex function type
let complex: (str, &[i64], (i64) -> bool) -> &[str] = process_data;
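
The declarations above refer to functions that are defined elsewhere. A minimal self-contained sketch, with add_one and apply_to_answer as hypothetical helpers, could look like this:

```why
fn get_answer(): i64 {
    42
}

fn add_one(x: i64): i64 {
    x + 1
}

// Takes a function of type (i64) -> i64 and applies it to the answer
fn apply_to_answer(func: (i64) -> i64): i64 {
    func(get_answer())
}

fn main(): i64 {
    let getter: () -> i64 = get_answer;
    let higher_order: ((i64) -> i64) -> i64 = apply_to_answer;

    // add_one(42) - 42 = 1
    return higher_order(add_one) - getter();
}
```

Note how the parameter type of higher_order is itself written in parentheses, so the outer arrow belongs to the function being declared and the inner arrow to its argument.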

Main Function

Every Y program must have a main function, which serves as the entry point and returns either i64 or void:

fn main(): i64 {
    // Program entry point
    // Return 0 for success, non-zero for error
    return 0;
}

// Or with void return
fn main(): void {
    // Program logic here
}

External Function Declarations

You can declare functions implemented externally (like C functions):

declare printf: (str) -> void;
declare malloc: (i64) -> void;
declare strlen: (str) -> i64;

// Usage
fn main(): void {
    printf("Hello from Y!\n");
}

Best Practices

Function Naming

// Good: Clear, descriptive names
fn calculate_total_price(items: &[Item]): f64 { ... }
fn is_valid_email(email: str): bool { ... }
fn format_currency(amount: f64): str { ... }

// Less ideal: Unclear names
fn calc(x: &[Item]): f64 { ... }
fn check(s: str): bool { ... }
fn fmt(n: f64): str { ... }

Function Size

Keep functions focused on a single responsibility:

// Good: Single responsibility
fn validate_age(age: i64): bool {
    age >= 0 && age <= 150
}

fn calculate_tax(amount: f64, rate: f64): f64 {
    amount * rate
}

// Better than one large function handling everything
fn process_order(order: Order): OrderResult {
    validate_order(order);
    calculate_total(order);
    apply_discounts(order);
    finalize_order(order)
}

Type Annotations

Function parameters and return types always require explicit type annotations:

// Good: Clear types
fn process_data(input: &[i64], threshold: i64): &[i64] { ... }

// Required: Y needs explicit function signatures
fn calculate(x: i64, y: i64): i64 { x + y }

Error Handling

Design functions to handle edge cases:

fn safe_divide(a: i64, b: i64): i64 {
    if (b == 0) {
        return 0;  // Or handle error appropriately
    } else {
        return a / b;
    }
}

fn get_array_element(arr: &[i64], index: i64): i64 {
    // Bounds checking would go here
    return arr[index];
}
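
One way the bounds check could be filled in is sketched below. Because this chapter does not establish how to query the length of an array, the hypothetical get_array_element_or takes the length as an explicit parameter, together with a fallback value to return for invalid indices.

```why
// Hypothetical variant: the caller passes the array length and a fallback value
fn get_array_element_or(arr: &[i64], len: i64, index: i64, fallback: i64): i64 {
    if (index < 0) {
        fallback
    } else if (index >= len) {
        fallback
    } else {
        arr[index]
    }
}

fn main(): i64 {
    let numbers = &[10, 20, 30];
    let inside = get_array_element_or(numbers, 3, 1, 0);   // 20
    let outside = get_array_element_or(numbers, 3, 7, 0);  // 0 (index out of range)
    return inside + outside;                               // 20
}
```

Returning a fallback keeps the signature simple; the caller has to pick a value that cannot be confused with real data.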

Lambda Expressions

Lambda expressions in Y are anonymous functions that can be created inline. They provide a concise way to define functions without explicit names, especially useful for short operations and higher-order function programming.

Lambda Syntax

A lambda consists of a backslash (\), a parenthesized parameter list, the => arrow, and a body expression:

\(param1, param2) => expression

Simple Lambdas

// Identity function
let identity = \(x) => x;

// Simple arithmetic
let add_one = \(x) => x + 1;
let multiply = \(x, y) => x * y;

// Boolean operations
let is_even = \(n) => n % 2 == 0;

Lambda Types

Lambdas have function types that can be explicitly specified:

let doubler: (i64) -> i64 = \(x) => x * 2;
let comparator: (i64, i64) -> bool = \(a, b) => a > b;
let processor: (str) -> void = \(s) => printf(s);

Using Lambdas

As Variables

let square = \(n) => n * n;
let result = square(5);  // 25

let max = \(a, b) => if (a > b) { a } else { b };
let bigger = max(10, 20);  // 20

As Function Arguments

fn apply_to_number(n: i64, func: (i64) -> i64): i64 {
    func(n)
}

// Using lambda directly as argument
let result = apply_to_number(10, \(x) => x * 3);  // 30

fn apply_operation(x: i64, y: i64, op: (i64, i64) -> i64): i64 {
    op(x, y)
}

// Lambda for custom operations
let sum = apply_operation(5, 7, \(a, b) => a + b);      // 12
let product = apply_operation(3, 4, \(a, b) => a * b);  // 12

Lambdas in Structs

Lambdas can be stored in struct fields:

struct Processor {
    transform: (i64) -> i64;
    name: str;
}

let doubler_proc = Processor {
    transform: \(x) => x * 2,
    name: "Doubler"
};

let result = doubler_proc.transform(21);  // 42

Examples from Y Code

Basic Lambda Usage

fn main(): i64 {
    // Lambda with explicit type annotation
    let x: (i64) -> i64 = \(x) => x;

    // Using the lambda
    let result = x(42);

    return result;
}

Lambdas as Return Values

fn foobar(): (i64) -> i64 {
    return \(x) => x;  // Return lambda expression
}

fn create_multiplier(factor: i64): (i64) -> i64 {
    return \(x) => x * factor;
}

// Usage
let times_three = create_multiplier(3);
let result = times_three(10);  // 30

Lambdas with Complex Logic

// Multi-statement lambdas (using blocks)
let complex_processor = \(x) => {
    let doubled = x * 2;
    let incremented = doubled + 1;
    incremented
};

// Conditional lambdas
let abs_value = \(x) => if (x < 0) { -x } else { x };

// Lambda with multiple parameters
let distance = \(x1, y1, x2, y2) => {
    let dx = x2 - x1;
    let dy = y2 - y1;
    sqrt(dx * dx + dy * dy)
};

Higher-Order Programming

Function Composition

fn compose(f: (i64) -> i64, g: (i64) -> i64): (i64) -> i64 {
    return \(x) => f(g(x));
}

let add_one = \(x) => x + 1;
let double = \(x) => x * 2;

let add_then_double = compose(double, add_one);
let result = add_then_double(5);  // (5 + 1) * 2 = 12

Array Processing (Conceptual)

// If Y had higher-order array functions:
fn map(arr: &[i64], func: (i64) -> i64): &[i64] {
    // Implementation would go here
    return arr;  // Placeholder
}

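// Usage with lambdas (the results in the comments assume a real map
// implementation; the placeholder above returns arr unchanged)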
let numbers = &[1, 2, 3, 4, 5];
let doubled = map(numbers, \(x) => x * 2);  // [2, 4, 6, 8, 10]
let squared = map(numbers, \(x) => x * x);  // [1, 4, 9, 16, 25]

Event Handlers (Conceptual)

struct Button {
    label: str;
    on_click: () -> void;
}

let save_button = Button {
    label: "Save",
    on_click: \() => printf("Saving data...\n")
};

let cancel_button = Button {
    label: "Cancel",
    on_click: \() => printf("Operation cancelled\n")
};

Practical Examples

Mathematical Operations

struct MathOperations {
    sin: (f64) -> f64;
    cos: (f64) -> f64;
    square: (f64) -> f64;
}

let math = MathOperations {
    sin: \(x) => external_sin(x),  // Assuming external function
    cos: \(x) => external_cos(x),
    square: \(x) => x * x
};

Data Transformation

fn transform_data(value: i64, transformer: (i64) -> i64): i64 {
    transformer(value)
}

// Different transformations
let normalized = transform_data(150, \(x) => x / 100);      // 1
let clamped = transform_data(150, \(x) => if (x > 100) { 100 } else { x });  // 100
let negated = transform_data(150, \(x) => -x);              // -150

Configuration and Callbacks

struct Config {
    validator: (str) -> bool;
    formatter: (str) -> str;
    processor: (str) -> void;
}

let email_config = Config {
    validator: \(email) => contains(email, "@"),  // Conceptual
    formatter: \(email) => to_lowercase(email),   // Conceptual
    processor: \(email) => send_email(email)      // Conceptual
};

Lambda Limitations

  1. Single Expression: Lambdas work best with single expressions (though blocks can be used)
  2. Type Inference: May need explicit type annotations in some contexts
  3. Closure Scope: Closure support is currently limited, so capturing variables from an enclosing scope may not work in every context
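
The last two limitations can often be sidestepped. The sketch below is one possible workaround and not an official recommendation: it annotates the lambda's type explicitly and passes the value in as an ordinary parameter so that nothing has to be captured. The name multiply_by is hypothetical.

```why
fn main(): i64 {
    // Explicit type annotation, so nothing is left to inference
    let multiply_by: (i64, i64) -> i64 = \(x, factor) => x * factor;

    // factor is a regular parameter, so no variable is captured
    return multiply_by(10, 3);  // 30
}
```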

Best Practices

When to Use Lambdas

// Good: Short, simple operations
let double = \(x) => x * 2;
let is_positive = \(x) => x > 0;

// Good: One-time use in function calls
process_array(data, \(x) => x + 1);

// Consider named function: Complex logic
fn complex_validation(input: str): bool {
    // Multiple checks and logic
    return true;  // Placeholder
}

Readability

// Good: Clear and concise
let operations = &[
    \(x) => x + 1,
    \(x) => x * 2,
    \(x) => x - 1
];

// Less readable: Too complex for lambda
let complex = \(x) => {
    let temp = x * 2;
    let adjusted = temp + 10;
    if (adjusted > 100) { 100 } else { adjusted }
};

// Better as named function:
fn complex_transform(x: i64): i64 {
    let temp = x * 2;
    let adjusted = temp + 10;
    if (adjusted > 100) { 100 } else { adjusted }
}

Type Clarity

// Good: Explicit types when needed
let processor: (str) -> bool = \(s) => validate(s);

// Good: Clear parameter names
let distance_calc = \(x1, y1, x2, y2) =>
    sqrt((x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1));

// Less clear: Unclear parameters
let calc = \(a, b, c, d) => sqrt((c - a) * (c - a) + (d - b) * (d - b));

Methods and Instances

Y supports object-oriented programming concepts through instance blocks, which allow you to define methods associated with specific types. This enables you to extend both built-in types and custom structs with additional functionality.

Instance Blocks

Instance blocks define methods for a specific type using the instance keyword:

instance TypeName {
    fn method_name(parameters): ReturnType {
        // method implementation
    }
}

Methods on Custom Structs

Basic Method Definition

struct FooStruct {
    id: i64;
    amount: f64;
}

instance FooStruct {
    fn get_id(): i64 {
        this.id
    }

    fn get_amount(): f64 {
        this.amount
    }

    fn set_amount(new_amount: f64): void {
        this.amount = new_amount;
    }
}

Using the this Keyword

The this keyword refers to the current instance:

struct TestStruct {
    x: i64;
    bar: (i64, i64) -> i64;
}

instance TestStruct {
    fn get_x(): i64 {
        return this.x;  // Access field through this
    }

    fn set_x(x: i64): void {
        this.x = x;     // Modify field through this
    }

    fn double_x(): i64 {
        this.x * 2     // Use field in computation
    }
}

Method Calls

Call methods using dot notation:

let mut foo = FooStruct {  // mut, because set_amount modifies the instance
    id: 42,
    amount: 133.7
};

let id = foo.get_id();        // 42
let amount = foo.get_amount(); // 133.7

foo.set_amount(200.0);
let new_amount = foo.get_amount(); // 200.0

External Method Declarations

You can declare methods that are implemented externally:

instance TestStruct {
    fn get_x(): i64 {
        return this.x;
    }

    declare get_id(): i64;  // External implementation
}

instance str {
    declare len(): i64;     // String length function
}

Methods on Built-in Types

Y allows extending built-in types with custom methods:

instance i64 {
    declare add(i64): i64;    // Custom addition
    declare multiply(i64): i64;
}

instance str {
    declare len(): i64;       // String length
    declare to_upper(): str;  // Convert to uppercase
}

// Usage
let text = "hello";
let length = text.len();      // Call method on string
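
Methods declared on built-in types can be used inside regular functions. As a small sketch, the hypothetical helper is_empty builds on the externally implemented len method; it assumes that len is provided at link time.

```why
instance str {
    declare len(): i64;  // external implementation, as declared above
}

// Hypothetical helper built on top of the declared method
fn is_empty(text: str): bool {
    text.len() == 0
}

fn main(): i64 {
    if (is_empty("hello")) {
        return 1;
    } else {
        return 0;  // expected path: "hello" is not empty
    }
}
```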

Complex Method Examples

Methods with Business Logic

struct BankAccount {
    balance: f64;
    account_number: str;
    is_active: bool;
}

instance BankAccount {
    fn deposit(amount: f64): void {
        if (amount > 0.0) {
            this.balance = this.balance + amount;
        }
    }

    fn withdraw(amount: f64): bool {
        if (amount > 0.0 && amount <= this.balance && this.is_active) {
            this.balance = this.balance - amount;
            return true;
        } else {
            return false;
        }
    }

    fn get_balance(): f64 {
        if (this.is_active) {
            this.balance
        } else {
            0.0
        }
    }
}

Methods Using Other Methods

struct Point {
    x: f64;
    y: f64;
}

instance Point {
    fn distance_from_origin(): f64 {
        sqrt(this.x * this.x + this.y * this.y)
    }

    fn distance_from(other: Point): f64 {
        let dx = this.x - other.x;
        let dy = this.y - other.y;
        sqrt(dx * dx + dy * dy)
    }

    fn move_by(dx: f64, dy: f64): void {
        this.x = this.x + dx;
        this.y = this.y + dy;
    }

    fn normalize(): void {
        let distance = this.distance_from_origin();
        if (distance > 0.0) {
            this.x = this.x / distance;
            this.y = this.y / distance;
        }
    }
}

System Integration

A struct can mix regular methods with externally implemented ones, and a global instance of it can be declared as external:

struct System {}

instance System {
    fn answer(): i64 {
        42
    }

    declare print(i64): void;   // External print function
}

declare Sys: System;  // Global system instance

fn main(): void {
    Sys.print(Sys.answer());   // Call system methods
}

Real Examples from Y Code

Struct with Methods and External Declarations

struct FooStruct {
    id: i64;
}

instance FooStruct {
    fn get_id(): i64 {
        this.id
    }
}

instance i64 {
    declare add(i64): i64;
}

struct System {}

instance System {
    fn answer(): i64 {
        42
    }

    declare print(i64): void;
}

declare Sys: System;

fn main(): void {
    Sys.print(Sys.answer());
}

Complete Workflow Example

fn main(): i64 {
    let mut my_struct = TestStruct {  // mut, because set_x modifies the instance
        x: 42,
        bar: add
    };

    // Calling struct methods
    let value_of_x = my_struct.get_x();   // Get current value
    my_struct.set_x(1337);                // Set new value
    let new_value = my_struct.get_x();    // Get updated value

    // Using external method
    let id = my_struct.get_id();

    return 0;
}

Method Chaining

Methods can support chaining when each chained method returns the instance it was called on:

struct Builder {
    value: i64;
}

instance Builder {
    fn set_value(val: i64): Builder {
        this.value = val;
        return this;  // Return self for chaining
    }

    fn add(val: i64): Builder {
        this.value = this.value + val;
        return this;
    }

    fn build(): i64 {
        this.value
    }
}

// Usage (conceptual - depends on Y's ownership model)
let result = Builder { value: 0 }
    .set_value(10)
    .add(5)
    .build();  // 15

Best Practices

Method Organization

struct User {
    name: str;
    email: str;
    age: i64;
    is_active: bool;
}

instance User {
    // Getters
    fn get_name(): str { this.name }
    fn get_email(): str { this.email }
    fn get_age(): i64 { this.age }

    // Setters (with validation)
    fn set_age(new_age: i64): bool {
        if (new_age >= 0 && new_age <= 150) {
            this.age = new_age;
            return true;
        } else {
            return false;
        }
    }

    // Business logic
    fn is_adult(): bool {
        this.age >= 18
    }

    fn deactivate(): void {
        this.is_active = false;
    }
}

Method Naming

// Good: Clear, descriptive method names
instance BankAccount {
    fn get_balance(): f64 { ... }
    fn deposit(amount: f64): void { ... }
    fn is_account_active(): bool { ... }
    fn calculate_interest(rate: f64): f64 { ... }
}

// Less ideal: Unclear names
instance BankAccount {
    fn bal(): f64 { ... }
    fn add(amount: f64): void { ... }
    fn check(): bool { ... }
    fn calc(rate: f64): f64 { ... }
}

Error Handling in Methods

instance Calculator {
    fn divide(a: f64, b: f64): f64 {
        if (b == 0.0) {
            return 0.0;  // Or appropriate error handling
        } else {
            return a / b;
        }
    }

    fn safe_access_array(arr: &[i64], index: i64): i64 {
        // Bounds checking would go here
        return arr[index];
    }
}

Encapsulation

struct Counter {
    value: i64;
    max_value: i64;
}

instance Counter {
    // Public interface
    fn increment(): bool {
        if (this.can_increment()) {
            this.value = this.value + 1;
            return true;
        } else {
            return false;
        }
    }

    fn get_value(): i64 {
        this.value
    }

    // Helper method (conceptually private)
    fn can_increment(): bool {
        this.value < this.max_value
    }
}
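
To show how callers interact only with the public interface, here is a short usage sketch. It repeats the Counter definition so that it reads on its own, and it assumes that a method's return value may be discarded, as user.set_age(26) does elsewhere in this guide.

```why
struct Counter {
    value: i64;
    max_value: i64;
}

instance Counter {
    fn increment(): bool {
        if (this.can_increment()) {
            this.value = this.value + 1;
            return true;
        } else {
            return false;
        }
    }

    fn get_value(): i64 {
        this.value
    }

    fn can_increment(): bool {
        this.value < this.max_value
    }
}

fn main(): i64 {
    let mut counter = Counter {
        value: 0,
        max_value: 2
    };

    counter.increment();  // true, value is now 1
    counter.increment();  // true, value is now 2
    counter.increment();  // false, the limit is reached

    return counter.get_value();  // 2
}
```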

Integration with Functions

Regular functions can call methods on their parameters, and they can create, modify, and return structs through method calls:

fn process_user(user: User): void {
    if (user.is_adult()) {           // Method call
        let name = user.get_name();  // Another method call
        printf(name);                // Function call
    }
}

fn create_and_setup_user(): User {
    let mut user = User {
        name: "Alice",
        email: "alice@example.com",
        age: 25,
        is_active: true
    };

    user.set_age(26);  // Method call
    return user;       // Return the modified struct
}

Advanced Features

This section covers advanced language features in Y that provide additional flexibility and integration capabilities for complex programming scenarios.

Overview

Y includes several advanced features that go beyond basic programming constructs:

  • Constants - Compile-time constant values that cannot be changed
  • External Declarations - Interface with external libraries and system functions
  • Type Annotations - Explicit type specifications for complex scenarios

Constants

Constants in Y are immutable values known at compile time. They're useful for configuration values, mathematical constants, and other unchanging data:

const PI: f64 = 3.1415;
const MAX_USERS: i64 = 1000;
const DEBUG_MODE: bool = true;

External Declarations

The declare keyword allows you to interface with external functions, typically from C libraries or system functions:

declare printf: (str) -> void;
declare malloc: (i64) -> void;
declare strlen: (str) -> i64;

Type Annotations

While Y has type inference, explicit type annotations provide clarity and are required in certain contexts:

let processor: (str) -> bool = validate_input;
let numbers: &[i64] = &[];
let complex_type: ((i64) -> bool, &[str]) -> i64 = process_data;

Integration Features

These advanced features work together to enable:

  • System Integration - Interfacing with operating system functions
  • Library Integration - Using external C libraries
  • Performance Optimization - Compile-time constants for better optimization
  • Type Safety - Explicit type checking for complex function signatures

Example Usage

// Constants for configuration
const BUFFER_SIZE: i64 = 1024;
const DEFAULT_TIMEOUT: f64 = 30.0;

// External system functions
declare malloc: (i64) -> void;
declare free: (void) -> void;

// Complex type annotations
let data_processor: (str, (str) -> bool) -> &[str] = filter_strings;

fn main(): void {
    // Using constants
    let buffer = create_buffer(BUFFER_SIZE);

    // Using external functions
    malloc(BUFFER_SIZE);

    // Using complex types
    let filtered = data_processor("input", \(s) => is_valid(s));
}

These features enable Y to be both a high-level language with good abstractions and a systems programming language that can integrate closely with existing infrastructure.

Constants

Constants in Y are immutable values that are known at compile time. They provide a way to define unchanging values that can be used throughout your program, offering both clarity and potential performance benefits.

Constant Declaration

Constants are declared using the const keyword:

const CONSTANT_NAME: Type = value;

The value must be a compile-time constant expression.

Basic Constants

Numeric Constants

const PI: f64 = 3.1415;
const MAX_SIZE: i64 = 1000;
const DEFAULT_PORT: u32 = 8080;
const GOLDEN_RATIO: f64 = 1.618;

Boolean Constants

const DEBUG_MODE: bool = true;
const PRODUCTION_BUILD: bool = false;
const ENABLE_LOGGING: bool = true;

String Constants

const APPLICATION_NAME: str = "Y Lang Compiler";
const VERSION: str = "1.0.0";
const DEFAULT_CONFIG_FILE: str = "config.yml";

Character Constants

const SEPARATOR: char = ',';
const NEWLINE: char = '\n';
const TAB: char = '\t';

Using Constants

Constants can be used anywhere their type is expected:

const MAX_USERS: i64 = 100;
const TIMEOUT_SECONDS: f64 = 30.0;

fn create_user_pool(): UserPool {
    UserPool {
        capacity: MAX_USERS,
        timeout: TIMEOUT_SECONDS
    }
}

fn is_valid_user_count(count: i64): bool {
    count <= MAX_USERS
}

Examples from Y Code

Mathematical Constants

const PI: f64 = 3.1415;

fn calculate_circle_area(radius: f64): f64 {
    PI * radius * radius
}

fn calculate_circumference(radius: f64): f64 {
    2.0 * PI * radius
}

Configuration Constants

const BUFFER_SIZE: i64 = 1024;
const MAX_CONNECTIONS: i64 = 100;
const DEFAULT_TIMEOUT: f64 = 5.0;

fn create_server(): Server {
    Server {
        buffer_size: BUFFER_SIZE,
        max_connections: MAX_CONNECTIONS,
        timeout: DEFAULT_TIMEOUT
    }
}

Global Constants

Constants can be declared at the top level and used throughout your program:

const PROGRAM_VERSION: str = "2.1.0";
const MAX_RETRIES: i64 = 3;
const ERROR_THRESHOLD: f64 = 0.01;

fn main(): void {
    printf("Starting program version: ");
    printf(PROGRAM_VERSION);
    printf("\n");

    let config = Config {
        retries: MAX_RETRIES,
        threshold: ERROR_THRESHOLD
    };
}

Constants vs Variables

Constants

const FIXED_VALUE: i64 = 42;  // Cannot be changed
// FIXED_VALUE = 100;         // Error: cannot modify constant

Variables

let mut changeable_value: i64 = 42;  // Can be changed
changeable_value = 100;              // Valid: variable is mutable

Practical Examples

System Configuration

const SYSTEM_NAME: str = "Y Lang Runtime";
const VERSION_MAJOR: i64 = 1;
const VERSION_MINOR: i64 = 0;
const VERSION_PATCH: i64 = 0;

struct SystemInfo {
    name: str;
    version: str;
}

fn get_system_info(): SystemInfo {
    SystemInfo {
        name: SYSTEM_NAME,
        version: format_version(VERSION_MAJOR, VERSION_MINOR, VERSION_PATCH)
    }
}

Mathematical Operations

const E: f64 = 2.71828;           // Euler's number
const SQRT_2: f64 = 1.41421;      // Square root of 2
const GOLDEN_RATIO: f64 = 1.61803; // Golden ratio

fn exponential_growth(initial: f64, time: f64): f64 {
    initial * pow(E, time)
}

fn diagonal_length(side: f64): f64 {
    side * SQRT_2
}

Game Development

const SCREEN_WIDTH: i64 = 1920;
const SCREEN_HEIGHT: i64 = 1080;
const FPS: i64 = 60;
const GRAVITY: f64 = 9.81;

struct GameSettings {
    width: i64;
    height: i64;
    target_fps: i64;
    physics_gravity: f64;
}

fn create_game_settings(): GameSettings {
    GameSettings {
        width: SCREEN_WIDTH,
        height: SCREEN_HEIGHT,
        target_fps: FPS,
        physics_gravity: GRAVITY
    }
}

Network Configuration

const DEFAULT_HTTP_PORT: i64 = 80;
const DEFAULT_HTTPS_PORT: i64 = 443;
const MAX_PACKET_SIZE: i64 = 1500;
const CONNECTION_TIMEOUT: f64 = 10.0;

struct NetworkConfig {
    http_port: i64;
    https_port: i64;
    packet_size: i64;
    timeout: f64;
}

fn create_network_config(): NetworkConfig {
    NetworkConfig {
        http_port: DEFAULT_HTTP_PORT,
        https_port: DEFAULT_HTTPS_PORT,
        packet_size: MAX_PACKET_SIZE,
        timeout: CONNECTION_TIMEOUT
    }
}

Constants in Expressions

Constants can be used in any expression where their type is appropriate:

const BASE_SCORE: i64 = 100;
const MULTIPLIER: f64 = 1.5;

fn calculate_final_score(bonus: i64, time_factor: f64): f64 {
    (BASE_SCORE + bonus) * MULTIPLIER * time_factor
}

fn is_high_score(score: i64): bool {
    score > BASE_SCORE * 10
}
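
The calculate_final_score example above combines i64 and f64 operands in one expression. Whether Y converts between the two implicitly is not covered here, so the following variant is a sketch that avoids the question by keeping every operand an f64. BASE_SCORE_F is a hypothetical name chosen to avoid clashing with the i64 constant.

```why
const BASE_SCORE_F: f64 = 100.0;
const MULTIPLIER: f64 = 1.5;

fn calculate_final_score(bonus: f64, time_factor: f64): f64 {
    (BASE_SCORE_F + bonus) * MULTIPLIER * time_factor
}

fn main(): i64 {
    // (100.0 + 20.0) * 1.5 * 2.0 = 360.0
    let score = calculate_final_score(20.0, 2.0);
    return 0;
}
```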

Constant Arrays (Conceptual)

While Y's current syntax may not support constant arrays directly, the concept would be:

// Conceptual - may not be currently supported
const PRIME_NUMBERS: &[i64] = &[2, 3, 5, 7, 11, 13, 17, 19];
const FIBONACCI: &[i64] = &[1, 1, 2, 3, 5, 8, 13, 21];

Best Practices

Naming Conventions

Use UPPER_CASE for constants to distinguish them from variables:

// Good: Clear constant naming
const MAX_BUFFER_SIZE: i64 = 4096;
const DEFAULT_CHARSET: str = "UTF-8";
const ENABLE_DEBUG: bool = false;

// Less ideal: Unclear naming
const maxBufferSize: i64 = 4096;  // Looks like a variable
const size: i64 = 4096;           // Too generic

Grouping Related Constants

Keep related constants together under a shared comment:

// Database configuration
const DB_HOST: str = "localhost";
const DB_PORT: i64 = 5432;
const DB_NAME: str = "ylang_db";
const DB_TIMEOUT: f64 = 30.0;

// Graphics configuration
const WINDOW_WIDTH: i64 = 1024;
const WINDOW_HEIGHT: i64 = 768;
const REFRESH_RATE: i64 = 60;
const VSYNC_ENABLED: bool = true;

Documentation

Document the purpose and units of constants:

// Network timeout in seconds
const NETWORK_TIMEOUT: f64 = 30.0;

// Maximum file size in bytes (10 MB)
const MAX_FILE_SIZE: i64 = 10485760;

// Frame rate in frames per second
const TARGET_FPS: i64 = 60;

// Speed of light in meters per second
const SPEED_OF_LIGHT: f64 = 299792458.0;
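
A value such as 10485760 is easier to trust when its derivation is visible. The sketch below shows where the number comes from; BYTES_PER_KIB and mib_to_bytes are hypothetical names, and the calculation happens at run time because this guide does not state whether constant initializers may contain arithmetic.

```why
// Bytes per kibibyte
const BYTES_PER_KIB: i64 = 1024;

// Hypothetical helper: convert mebibytes to bytes
fn mib_to_bytes(mib: i64): i64 {
    mib * BYTES_PER_KIB * BYTES_PER_KIB
}

fn main(): i64 {
    let max_file_size = mib_to_bytes(10);  // 10 * 1024 * 1024 = 10485760
    return 0;
}
```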

Avoid Magic Numbers

Replace magic numbers with named constants:

// Bad: Magic numbers
fn process_data(data: &[i64]): bool {
    data.length() <= 1000 && data[0] > 42
}

// Good: Named constants
const MAX_DATA_SIZE: i64 = 1000;
const MIN_THRESHOLD: i64 = 42;

fn process_data(data: &[i64]): bool {
    data.length() <= MAX_DATA_SIZE && data[0] > MIN_THRESHOLD
}

External Declarations

External declarations in Y allow you to interface with functions and values implemented outside of Y, such as C library functions, system calls, or runtime-provided functionality. This enables Y programs to leverage existing libraries and system capabilities.

Declaration Syntax

Use the declare keyword to declare external functions and values:

declare function_name: (param_types) -> return_type;
declare variable_name: Type;

Function Declarations

Basic Function Declarations

declare printf: (str) -> void;
declare malloc: (i64) -> void;
declare strlen: (str) -> i64;
declare sqrt: (f64) -> f64;

Using Declared Functions

Once declared, external functions can be used like regular Y functions:

declare printf: (str) -> void;

fn main(): void {
    printf("Hello from Y!\n");
}

Examples from Y Code

System Functions

declare printf: (str) -> void;

fn baz(x: i64): i64 {
    let intermediate = x * 2;
    return intermediate;
}

fn main(): i64 {
    printf("Foo\n");  // Using external printf
    let x = 12;
    let a = baz(x);
    return x + a;
}

Variable Declarations

struct System {}

instance System {
    fn answer(): i64 {
        42
    }

    declare print(i64): void;  // External method
}

declare Sys: System;  // External global variable

fn main(): void {
    Sys.print(Sys.answer());  // Using external system
}

Method Declarations

External methods can be declared within instance blocks:

struct TestStruct {
    x: i64;
}

instance TestStruct {
    fn get_x(): i64 {
        this.x
    }

    declare get_id(): i64;  // External method implementation
}

instance str {
    declare len(): i64;     // External string length
}

instance i64 {
    declare add(i64): i64;  // External arithmetic operation
}

Common Use Cases

Standard Library Functions

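// C standard library functions
// Note: in C, malloc and memcpy return pointers and free takes one. This guide
// shows no pointer type, so void stands in for an opaque pointer here.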
declare malloc: (i64) -> void;
declare free: (void) -> void;
declare memcpy: (void, void, i64) -> void;
declare strcmp: (str, str) -> i64;

// Math library functions
declare sin: (f64) -> f64;
declare cos: (f64) -> f64;
declare sqrt: (f64) -> f64;
declare pow: (f64, f64) -> f64;

System Calls

// File operations
declare open: (str, i64) -> i64;
declare read: (i64, void, i64) -> i64;
declare write: (i64, void, i64) -> i64;
declare close: (i64) -> i64;

// Process operations
declare getpid: () -> i64;
declare exit: (i64) -> void;

Custom Runtime Functions

// Custom Y runtime functions
declare gc_collect: () -> void;
declare debug_print: (str) -> void;
declare get_timestamp: () -> i64;
declare allocate_array: (i64) -> void;

Integration Patterns

Wrapper Functions

Create Y functions that wrap external declarations for better ergonomics:

declare c_strlen: (str) -> i64;
declare c_strcmp: (str, str) -> i64;

fn string_length(s: str): i64 {
    c_strlen(s)
}

fn strings_equal(a: str, b: str): bool {
    c_strcmp(a, b) == 0
}

Error Handling

declare c_malloc: (i64) -> void;
declare c_free: (void) -> void;

fn safe_allocate(size: i64): void {
    if (size > 0) {
        return c_malloc(size);
    } else {
        return null;  // Conceptual null value; or handle the error appropriately
    }
}

Type-Safe Interfaces

// Raw external functions
declare raw_read_file: (str) -> void;
declare raw_write_file: (str, void) -> i64;

// Type-safe wrappers
fn read_text_file(filename: str): str {
    // Implementation that ensures str return type
    return convert_to_string(raw_read_file(filename));
}

fn write_text_file(filename: str, content: str): bool {
    let result = raw_write_file(filename, string_to_bytes(content));
    return result >= 0;
}

Platform-Specific Declarations

Unix/Linux

declare fork: () -> i64;
declare execv: (str, &[str]) -> i64;  // POSIX provides the exec family (execv, execvp, ...), not a plain exec
declare waitpid: (i64, void, i64) -> i64;
declare signal: (i64, void) -> void;

Windows

declare CreateProcess: (str, str, void, void, bool, i64, void, str, void, void) -> bool;
declare CloseHandle: (void) -> bool;
declare GetLastError: () -> i64;

Real-World Integration Example

// Graphics library integration
declare sdl_init: (i64) -> i64;
declare sdl_create_window: (str, i64, i64, i64, i64, i64) -> void;
declare sdl_destroy_window: (void) -> void;
declare sdl_quit: () -> void;

struct Window {
    title: str;
    width: i64;
    height: i64;
    handle: void;  // Opaque handle
}

fn create_window(title: str, width: i64, height: i64): Window {
    let handle = sdl_create_window(title, 100, 100, width, height, 0);
    Window {
        title: title,
        width: width,
        height: height,
        handle: handle
    }
}

fn main(): i64 {
    if (sdl_init(0x20) < 0) {  // 0x20 corresponds to SDL_INIT_VIDEO
        return 1;  // Error
    }

    let window = create_window("Y Lang App", 800, 600);

    // Main loop would go here

    sdl_destroy_window(window.handle);
    sdl_quit();
    return 0;
}

Network Programming

// Socket operations
declare socket: (i64, i64, i64) -> i64;
declare bind: (i64, void, i64) -> i64;
declare listen: (i64, i64) -> i64;
declare accept: (i64, void, void) -> i64;
declare send: (i64, void, i64, i64) -> i64;
declare recv: (i64, void, i64, i64) -> i64;

struct Server {
    socket_fd: i64;
    port: i64;
}

fn create_server(port: i64): Server {
    let sock = socket(2, 1, 0);  // AF_INET, SOCK_STREAM, 0
    // Additional setup would go here
    Server {
        socket_fd: sock,
        port: port
    }
}
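
The POSIX socket call returns -1 on failure, so the descriptor should be checked before it is used. A minimal sketch, reusing the numeric values from above (they match common Linux definitions of AF_INET and SOCK_STREAM, but are platform-specific), might be:

```why
declare socket: (i64, i64, i64) -> i64;
declare close: (i64) -> i64;

fn main(): i64 {
    let sock = socket(2, 1, 0);  // AF_INET, SOCK_STREAM, 0
    if (sock < 0) {
        return 1;                // socket creation failed
    } else {
        close(sock);             // release the descriptor again
        return 0;
    }
}
```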

Best Practices

Clear Naming

// Good: Clear external function names
declare c_printf: (str) -> void;
declare libc_malloc: (i64) -> void;
declare posix_open: (str, i64) -> i64;

// Less clear: Ambiguous names
declare func1: (str) -> void;
declare ext_call: (i64) -> void;

Documentation

// File system operations from POSIX
declare open: (str, i64) -> i64;    // Open file, returns file descriptor
declare read: (i64, void, i64) -> i64;  // Read from fd, returns bytes read
declare close: (i64) -> i64;        // Close file descriptor

// Graphics library bindings
declare gl_clear: (i64) -> void;    // Clear OpenGL buffers
declare gl_swap: () -> void;        // Swap front/back buffers

Error Handling

declare system_call: (i64) -> i64;

fn safe_system_call(arg: i64): bool {
    let result = system_call(arg);
    return result >= 0;  // Assuming negative values indicate errors
}

Type Safety

// Instead of raw pointers, use appropriate Y types when possible
declare unsafe_memory_op: (void, i64) -> void;

// Wrap in a safer interface (data_pointer stands in for a helper that yields the array's address)
fn safe_memory_operation(data: &[i64]): bool {
    if (data.length() > 0) {
        unsafe_memory_op(data_pointer(data), data.length());
        return true;
    } else {
        return false;
    }
}

Linking and Compilation

External declarations typically require:

  1. Header inclusion - Corresponding C headers during compilation
  2. Library linking - Linking against libraries that provide the implementations
  3. ABI compatibility - Ensuring parameter and return types match the external interface

Example build command:

yc program.why -lc -lm -lpthread  # Link against libc, libm, pthread

Type Annotations

Type annotations in Y provide explicit type information that helps with code clarity, type checking, and compiler optimization. While Y has type inference capabilities, explicit type annotations are required in certain contexts and recommended for complex types.

Basic Type Annotation Syntax

Type annotations use the colon (:) syntax:

let variable_name: Type = value;
fn function_name(param: Type): ReturnType { ... }

Variable Type Annotations

Simple Types

let age: i64 = 25;
let price: f64 = 99.99;
let name: str = "Alice";
let is_active: bool = true;
let grade: char = 'A';

When Type Inference Isn't Enough

// Empty arrays need explicit types
let numbers: &[i64] = &[];
let names: &[str] = &[];

// Ambiguous numeric literals
let count: u32 = 42;  // Specify u32 instead of default i64

Function Type Annotations

Function Parameters

All function parameters require type annotations:

fn calculate_area(width: f64, height: f64): f64 {
    width * height
}

fn process_user(user: User, active: bool): void {
    // Function implementation
}

Return Types

Functions must specify their return type:

fn get_name(): str {
    "Anonymous"
}

fn is_valid(input: str): bool {
    input.len() > 0
}

fn process_data(): void {
    // No return value
}

Complex Type Annotations

Function Types

Function types use the (param_types) -> return_type syntax:

// Function that takes two i64s and returns i64
let binary_op: (i64, i64) -> i64 = add;

// Function that takes no parameters and returns str
let getter: () -> str = get_default_name;

// Function that takes a function as parameter
let processor: ((i64) -> i64) -> i64 = apply_twice;
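
For readers who know Rust, the compiler's implementation language, the three annotations above correspond roughly to function-pointer types. The sketch below is Rust, not Y. The bodies of add, get_default_name, and apply_twice are assumptions made for illustration, since the text only names these functions, and double is a hypothetical helper.

```rust
// Rust analogue of the Y function types above (illustrative definitions).
fn add(a: i64, b: i64) -> i64 {
    a + b
}

fn get_default_name() -> &'static str {
    "Anonymous"
}

// Takes a function and returns an i64: applies `f` twice to a fixed seed of 1.
fn apply_twice(f: fn(i64) -> i64) -> i64 {
    f(f(1))
}

// Hypothetical helper used as the argument to `apply_twice`.
fn double(x: i64) -> i64 {
    x * 2
}

// Binds each function to a variable with an explicit function-pointer type.
fn demo() -> (i64, &'static str, i64) {
    let binary_op: fn(i64, i64) -> i64 = add; // Y: (i64, i64) -> i64
    let getter: fn() -> &'static str = get_default_name; // Y: () -> str
    let processor: fn(fn(i64) -> i64) -> i64 = apply_twice; // Y: ((i64) -> i64) -> i64
    (binary_op(2, 3), getter(), processor(double))
}
```

Calling demo() exercises all three bindings. The nesting of parentheses in the Y type ((i64) -> i64) -> i64 maps directly onto the nested fn(...) in the Rust type.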

Array Types

Array types specify the element type:

let scores: &[i64] = &[95, 87, 92];
let names: &[str] = &["Alice", "Bob", "Charlie"];
let flags: &[bool] = &[true, false, true];

// Multidimensional arrays
let matrix: &[&[i64]] = &[&[1, 2], &[3, 4]];

Struct Types

Struct types use the struct name:

struct Point {
    x: f64;
    y: f64;
}

let origin: Point = Point { x: 0.0, y: 0.0 };
let location: Point = calculate_position();

Advanced Type Annotations

Higher-Order Function Types

Complex function types that involve functions as parameters or return values:

// Function that takes a function and returns a function
let transformer: ((i64) -> i64) -> ((i64) -> i64) = create_transformer;

// Function that takes multiple function parameters
let combiner: ((i64) -> i64, (i64) -> i64, i64) -> i64 = combine_operations;

// Function that takes no parameters and returns a function from str to bool
let factory: () -> ((str) -> bool) = create_validator;

Nested Function Types

// Function that processes arrays with custom logic
let array_processor: (&[i64], (i64) -> bool) -> &[i64] = filter_array;

// Event handler type
let event_handler: (str, () -> void) -> void = register_handler;

// Complex data transformer
let data_transform: (&[str], (str) -> str, (str) -> bool) -> &[str] = process_strings;

Examples from Y Code

Function Variables with Type Annotations

fn add(x: i64, y: i64): i64 {
    x + y
}

fn main(): i64 {
    // Explicit function type annotation
    let x: (i64) -> i64 = \(x) => x;

    // Using the annotated function
    let result = x(42);
    return result;
}

Struct with Function Fields

struct TestStruct {
    x: i64;
    bar: (i64, i64) -> i64;  // Function type in struct
}

let my_struct: TestStruct = TestStruct {
    x: 42,
    bar: add  // Function reference
};

Array with Type Annotation

fn main(): void {
    let x: &[i64] = &[];  // Empty array needs type annotation
    x[0];  // Index expression syntax; on an empty array this index is out of bounds at runtime
}

Type Annotations in Practice

Configuration Objects

struct DatabaseConfig {
    host: str;
    port: i64;
    timeout: f64;
    ssl_enabled: bool;
}

struct ServerConfig {
    database: DatabaseConfig;
    max_connections: i64;
    request_handler: (str) -> str;  // Function type
}

let config: ServerConfig = ServerConfig {
    database: DatabaseConfig {
        host: "localhost",
        port: 5432,
        timeout: 30.0,
        ssl_enabled: true
    },
    max_connections: 100,
    request_handler: process_request
};

Generic-like Patterns

// Processing functions with explicit types
let string_processor: (str) -> str = normalize_string;
let number_processor: (i64) -> i64 = validate_number;
let array_processor: (&[i64]) -> &[i64] = sort_array;

// Data validation
let email_validator: (str) -> bool = is_valid_email;
let age_validator: (i64) -> bool = is_valid_age;
let password_validator: (str) -> bool = is_strong_password;

Callback Systems

struct EventSystem {
    on_click: () -> void;
    on_hover: (i64, i64) -> void;  // x, y coordinates
    on_key: (char) -> bool;        // key pressed, return if handled
}

let ui_events: EventSystem = EventSystem {
    on_click: handle_click,
    on_hover: \(x, y) => update_cursor(x, y),
    on_key: \(key) => process_key_input(key)
};

When Type Annotations Are Required

Function Signatures

// Required: All function parameters and return types
fn process(input: str, count: i64): bool {
    // Implementation
    return true;
}

Empty Collections

// Required: Empty arrays need type specification
let empty_numbers: &[i64] = &[];
let empty_strings: &[str] = &[];

Ambiguous Contexts

// Required when type can't be inferred
let processor: (str) -> bool = get_validator();
let data: &[i64] = get_array_data();

When Type Annotations Are Optional

Simple Variable Assignments

// Optional: Type can be inferred
let name = "Alice";     // Inferred as str
let age = 25;           // Inferred as i64
let price = 99.99;      // Inferred as f64
let active = true;      // Inferred as bool

Return Statements

// Type inferred from function signature
fn get_count(): i64 {
    return 42;  // Type inferred as i64
}

Best Practices

Clarity Over Brevity

// Good: Clear intent with explicit types
let user_validator: (User) -> bool = validate_user_data;
let error_handler: (str) -> void = log_error;

// Acceptable: When type is obvious
let name = "Alice";
let count = 42;

Complex Function Types

// Good: Break down complex types (assumes type aliases; the Grammar Reference does not define a "type" declaration)
type ValidationFunction = (str) -> bool;
type ProcessingFunction = (str, ValidationFunction) -> str;

let process_with_validation: ProcessingFunction = complex_processor;

// Less readable: Inline complex types
let processor: (str, (str) -> bool) -> str = complex_processor;

Consistent Style

// Good: Consistent annotation style
let config: Config = load_config();
let validator: (str) -> bool = create_validator();
let processor: (Config) -> Result = process_config;

// Inconsistent: Mixed annotation styles
let config = load_config();  // No annotation
let validator: (str) -> bool = create_validator();  // With annotation
let processor = process_config;  // No annotation

Documentation Through Types

// Good: Types that document intent
let error_logger: (str, i64) -> void = log_error_with_code;
let data_transformer: (&[str]) -> &[str] = clean_and_normalize;
let connection_manager: (str, i64) -> bool = establish_connection;

// Less clear: Generic types
let func1: (str, i64) -> void = log_error_with_code;
let func2: (&[str]) -> &[str] = clean_and_normalize;

Language Reference

This section provides comprehensive reference material for the Y programming language, covering grammar, syntax rules, operator precedence, and complete type information.

Reference Sections

Grammar Reference

Complete syntax rules and grammar definitions for all Y language constructs. This includes the formal grammar used by the parser and examples of valid syntax patterns.

Operator Precedence

Detailed information about operator precedence and associativity, helping you understand how complex expressions are evaluated and how to use parentheses effectively.

Built-in Types

Comprehensive reference for all built-in types in Y, including their properties, methods, size requirements, and usage patterns.

Quick Reference

Basic Syntax

// Variable declaration
let variable_name: Type = value;
let mut mutable_var: Type = value;

// Function declaration
fn function_name(param: Type): ReturnType {
    expression_or_statements
}

// Struct declaration
struct StructName {
    field: Type;
}

// Instance methods
instance StructName {
    fn method_name(): ReturnType {
        implementation
    }
}

// Constants
const CONSTANT_NAME: Type = value;

// External declarations
declare external_function: (Type) -> Type;

Type System

  • Primitive Types: i64, u32, f64, bool, char, str, void
  • Array Types: &[ElementType]
  • Function Types: (ParamTypes) -> ReturnType
  • User Types: Custom structs

Control Flow

// Conditional expressions
if (condition) { value1 } else { value2 }

// Loops
while (condition) {
    statements
}

Operators

  • Arithmetic: +, -, *, /
  • Comparison: ==, !=, <, >, <=, >=
  • Assignment: =
  • Access: . (property access), [] (array indexing)

Language Characteristics

  • Expression-oriented: Most constructs evaluate to values
  • Static typing: All types known at compile time
  • Memory safe: No manual memory management required
  • Functional features: First-class functions, lambdas
  • Immutable by default: Variables are immutable unless marked mut

This reference section provides the authoritative information for understanding Y's syntax, semantics, and type system.

Grammar Reference

This page provides the complete formal grammar for the Y programming language. The grammar is defined using a BNF-like notation and corresponds to the actual parser implementation.

Program Structure

Top-Level Program

Program := ToplevelStatement*

ToplevelStatement := FunctionDeclaration
                   | Constant
                   | Declaration
                   | StructDeclaration
                   | Instance
                   | Comment

Statements

Statement := FunctionDeclaration
           | VariableDeclaration
           | Assignment
           | WhileStatement
           | Constant
           | Expression ";"
           | YieldingExpression
           | Return
           | Declaration
           | StructDeclaration
           | Comment

Return := "return" Expression ";"

Declarations

Function Declaration

FunctionDeclaration := "fn" Identifier "(" ParameterList? ")" ":" TypeName Block

ParameterList := Parameter ("," Parameter)*
Parameter := Identifier ":" TypeName

Variable Declaration

VariableDeclaration := "let" "mut"? Identifier (":" TypeName)? "=" Expression ";"

Struct Declaration

StructDeclaration := "struct" Identifier "{" StructFieldDeclaration* "}"

StructFieldDeclaration := Identifier ":" TypeName ";"

Constant Declaration

Constant := "const" Identifier ":" TypeName "=" Expression ";"

External Declaration

Declaration := "declare" Identifier ":" TypeName ";"

Instance Declaration

Instance := "instance" TypeName "{" InstanceMember* "}"

InstanceMember := MethodDeclaration
                | MethodDeclare

MethodDeclaration := "fn" Identifier "(" ParameterList? ")" ":" TypeName Block
MethodDeclare := "declare" Identifier "(" ParameterList? ")" ":" TypeName ";"

Expressions

Expression Types

Expression := Boolean
            | Identifier
            | Number
            | String
            | Character
            | IfExpression
            | ParenthesizedExpression
            | BinaryExpression
            | Block
            | Lambda
            | Postfix
            | Prefix
            | Array
            | StructInitialisation

YieldingExpression := Expression  // Expression without semicolon terminator

Literals

Boolean := "true" | "false"

Number := Integer | Floating
Integer := [0-9]+
Floating := [0-9]+ "." [0-9]+

String := '"' ([^"\\] | \\.)* '"'
Character := "'" ([^'\\] | \\.) "'"

Identifier := [_a-zA-Z][_a-zA-Z0-9]*
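
The literal rules above are small enough to check by hand. As a minimal sketch (in Rust, and not taken from the actual lexer), the following recognizers mirror the Identifier, Integer, and Floating character classes. The function names are hypothetical.

```rust
// Hand-written recognizers mirroring the character classes in the rules above.

// Identifier := [_a-zA-Z][_a-zA-Z0-9]*
fn is_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c == '_' || c.is_ascii_alphabetic() => {
            chars.all(|c| c == '_' || c.is_ascii_alphanumeric())
        }
        _ => false,
    }
}

// Integer := [0-9]+
fn is_integer(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_digit())
}

// Floating := [0-9]+ "." [0-9]+
fn is_floating(s: &str) -> bool {
    match s.split_once('.') {
        Some((whole, frac)) => is_integer(whole) && is_integer(frac),
        None => false,
    }
}
```

Under these rules "3.14" is a Floating, while "1." and "1e5" are not, because both sides of the dot need at least one digit and no exponent form is defined. Keywords such as let also match the Identifier pattern, so a real lexer has to separate them in an additional step.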

Binary Expressions

BinaryExpression := Expression BinaryOperator Expression

BinaryOperator := "+"     // Addition (precedence 1, left-associative)
                | "-"     // Subtraction (precedence 1, left-associative)
                | "*"     // Multiplication (precedence 2, left-associative)
                | "/"     // Division (precedence 2, left-associative)
                | "=="    // Equality (precedence 0, left-associative)
                | "!="    // Inequality (precedence 0, left-associative)
                | "<"     // Less than (precedence 0, left-associative)
                | ">"     // Greater than (precedence 0, left-associative)
                | "<="    // Less or equal (precedence 0, left-associative)
                | ">="    // Greater or equal (precedence 0, left-associative)

Control Flow Expressions

IfExpression := "if" "(" Expression ")" Block ("else" Block)?

WhileStatement := "while" "(" Expression ")" Block

Block := "{" Statement* "}"

Function and Lambda Expressions

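Lambda := "\" "(" LambdaParameterList? ")" "=>" Expression

// Lambda parameters may omit the type, as in \(x) => x
LambdaParameterList := LambdaParameter ("," LambdaParameter)*
LambdaParameter := Identifier (":" TypeName)?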

Postfix := Expression PostfixOperator

PostfixOperator := FunctionCall
                 | PropertyAccess
                 | IndexExpression

FunctionCall := "(" ArgumentList? ")"
ArgumentList := Expression ("," Expression)*

PropertyAccess := "." Identifier
IndexExpression := "[" Expression "]"

Data Structure Expressions

Array := "&" "[" (Expression ("," Expression)*)? "]"

StructInitialisation := Identifier "{" FieldInitList? "}"
FieldInitList := FieldInit ("," FieldInit)*
FieldInit := Identifier ":" Expression

ParenthesizedExpression := "(" Expression ")"

Assignment and L-Values

Assignment := LValue "=" Expression ";"

LValue := Identifier
        | Expression PropertyAccess
        | Expression IndexExpression

// PropertyAccess and IndexExpression are the postfix forms defined above

Type System

Type Names

TypeName := PrimitiveType
          | ArrayType
          | FunctionType
          | UserType

PrimitiveType := "i64" | "u32" | "f64" | "bool" | "char" | "str" | "void"

ArrayType := "&" "[" TypeName "]"

FunctionType := "(" (TypeName ("," TypeName)*)? ")" "->" TypeName

UserType := Identifier  // User-defined struct types

Type Annotations

TypeAnnotation := ":" TypeName

Comments

Comment := "//" [^\n]*

Operator Precedence

From highest to lowest precedence:

  1. Postfix operators (function calls, property access, array indexing)
  2. Prefix operators (unary operations)
  3. Multiplicative (*, /) - precedence 2, left-associative
  4. Additive (+, -) - precedence 1, left-associative
  5. Comparison (==, !=, <, >, <=, >=) - precedence 0, left-associative
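
One common way to implement such a precedence table is precedence climbing. The sketch below is written in Rust and is an assumption about technique, not a copy of the Y parser. It evaluates integer expressions directly rather than building an AST, uses the binary precedence values 0, 1, and 2 from the grammar, and supports only the single-character operators + - * / < > plus unary minus and parentheses.

```rust
// Precedence-climbing evaluator for integer expressions, using the binary
// precedence values from the grammar: comparison 0, additive 1, multiplicative 2.

#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Num(i64),
    Op(char), // one of + - * / < >
    LParen,
    RParen,
}

fn tokenize(src: &str) -> Vec<Tok> {
    let chars: Vec<char> = src.chars().collect();
    let mut toks = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_ascii_digit() {
            let mut n = 0i64;
            while i < chars.len() && chars[i].is_ascii_digit() {
                n = n * 10 + chars[i].to_digit(10).unwrap() as i64;
                i += 1;
            }
            toks.push(Tok::Num(n));
            continue;
        }
        match c {
            '(' => toks.push(Tok::LParen),
            ')' => toks.push(Tok::RParen),
            '+' | '-' | '*' | '/' | '<' | '>' => toks.push(Tok::Op(c)),
            _ => {} // whitespace and unknown characters are skipped in this sketch
        }
        i += 1;
    }
    toks
}

fn precedence(op: char) -> u8 {
    match op {
        '*' | '/' => 2,
        '+' | '-' => 1,
        _ => 0, // '<' and '>'
    }
}

struct Parser {
    toks: Vec<Tok>,
    pos: usize,
}

impl Parser {
    // Prefix level: unary minus, parentheses, and number literals.
    fn primary(&mut self) -> i64 {
        let tok = self.toks[self.pos];
        self.pos += 1;
        match tok {
            Tok::Num(n) => n,
            Tok::Op('-') => -self.primary(),
            Tok::LParen => {
                let v = self.expr(0);
                self.pos += 1; // consume ')'
                v
            }
            other => panic!("unexpected token {:?}", other),
        }
    }

    // Parses binary operators whose precedence is at least `min_prec`.
    fn expr(&mut self, min_prec: u8) -> i64 {
        let mut lhs = self.primary();
        while let Some(&Tok::Op(op)) = self.toks.get(self.pos) {
            let prec = precedence(op);
            if prec < min_prec {
                break;
            }
            self.pos += 1;
            // `prec + 1` makes every operator left-associative.
            let rhs = self.expr(prec + 1);
            lhs = match op {
                '+' => lhs + rhs,
                '-' => lhs - rhs,
                '*' => lhs * rhs,
                '/' => lhs / rhs,
                '<' => (lhs < rhs) as i64,
                _ => (lhs > rhs) as i64,
            };
        }
        lhs
    }
}

fn eval(src: &str) -> i64 {
    let mut parser = Parser { toks: tokenize(src), pos: 0 };
    parser.expr(0)
}
```

With this sketch, eval("2 + 3 * 4") groups the multiplication first, and eval("10 - 3 - 2") groups from the left. Passing prec + 1 for the right operand is what enforces left associativity; passing prec instead would make the operators right-associative.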

Complete Grammar Examples

Function with All Features

// Function declaration with parameters and return type
fn complex_function(
    param1: i64,
    param2: &[str],
    callback: (str) -> bool
): &[str] {
    let mut results: &[str] = &[];
    let mut i = 0;

    while (i < param2.length()) {
        let current = param2[i];
        if (callback(current)) {
            results = append(results, current);
        }
        i = i + 1;
    }

    return results;
}

Struct with Instance Methods

struct ComplexStruct {
    id: i64;
    name: str;
    values: &[f64];
    processor: (f64) -> f64;
}

instance ComplexStruct {
    fn get_id(): i64 {
        this.id
    }

    fn process_values(): &[f64] {
        let mut results: &[f64] = &[];
        let mut i = 0;

        while (i < this.values.length()) {
            let processed = this.processor(this.values[i]);
            results = append(results, processed);
            i = i + 1;
        }

        return results;
    }

    declare external_method(input: str): bool;
}

Complex Expression

fn expression_example(): i64 {
    // && is illustrative here; the BinaryOperator rule above does not list it
    let result = if (condition1 && condition2) {
        calculate_value(
            array[index],
            \(x) => x * 2 + offset,
            struct_instance.method_call()
        )
    } else {
        default_value
    };

    return result;
}

Grammar Notes

Expression vs Statement Context

  • Expressions evaluate to values and can be used in expression contexts
  • Statements perform actions and include variable declarations, assignments, and control flow
  • Yielding expressions are expressions that serve as the final value of a block (no semicolon)

Semicolon Rules

  • Most statements end with semicolons
  • Expression statements end with semicolons
  • The last expression in a block can omit the semicolon (yielding expression)
  • Function bodies, if-else bodies, and while bodies use blocks
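
Rust follows a very similar convention, which makes it a convenient way to try the rule out. The sketch below is Rust rather than Y, and the function names are made up for illustration: a final expression without a semicolon becomes the value of its block, while statements inside a while body end in semicolons.

```rust
// A block's final expression, written without a semicolon, is the block's value.
fn classify(n: i64) -> &'static str {
    // `if`/`else` is an expression, so its value can be bound directly.
    let label = if n < 0 { "negative" } else { "non-negative" };
    label // no semicolon: this is the function's value
}

fn sum_to(limit: i64) -> i64 {
    let mut total = 0;
    let mut i = 1;
    // `while` is used for its effect; its body consists of statements.
    while i <= limit {
        total = total + i;
        i = i + 1;
    }
    total // yielding expression
}
```

Adding a semicolon after label or total would turn the yielding expression into an expression statement, and the function would no longer produce a value.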

Type Inference

  • Variable types can often be inferred from the assigned expression
  • Function parameters and return types must be explicitly declared
  • Empty arrays require explicit type annotations

This grammar reference corresponds to the actual implementation in the Y compiler and can be used to understand the valid syntax for all language constructs.

Operator Precedence Reference

This page provides the complete operator precedence and associativity rules for Y. Understanding these rules is crucial for writing correct expressions and knowing when parentheses are needed.

Precedence Levels

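Y operators are organized into precedence levels, with higher numbers indicating higher precedence (binding more tightly). The level numbers on this page rank all operator classes from 0 to 5; the Grammar Reference numbers only the binary operators, from 0 to 2, in the same relative order.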

Precedence  Operators          Associativity  Description
5           () [] .            Left           Postfix operators
4           - (unary)          Right          Prefix operators
3           * /                Left           Multiplicative
2           + -                Left           Additive
1           == != < > <= >=    Left           Comparison
0           =                  Right          Assignment

Operator Details

Postfix Operators (Precedence 5)

These operators have the highest precedence and bind most tightly:

// Function calls
function_name(arg1, arg2)
object.method(parameter)

// Array indexing
array[index]
matrix[row][column]

// Property access
struct_instance.field
object.method

// Chaining postfix operators
user.get_address().get_street()[0]

Examples:

let result = calculate(x, y).format().length();  // ((calculate(x, y)).format()).length()
let value = data[i].process(flag);               // (data[i]).process(flag)

Prefix Operators (Precedence 4)

Currently limited to unary minus:

-expression    // Unary negation

Examples:

let negative = -42;           // Unary minus
let result = -function_call(); // -(function_call())
let value = -array[0];        // -(array[0])

Multiplicative Operators (Precedence 3)

expression * expression    // Multiplication
expression / expression    // Division

Left-associative: a * b * c equals (a * b) * c

Examples:

let area = width * height;
let rate = distance / time;
let complex = a * b / c * d;  // ((a * b) / c) * d

Additive Operators (Precedence 2)

expression + expression    // Addition
expression - expression    // Subtraction

Left-associative: a + b + c equals (a + b) + c

Examples:

let total = base + tax + tip;     // (base + tax) + tip
let difference = end - start;
let calculation = a + b - c + d;  // ((a + b) - c) + d

Comparison Operators (Precedence 1)

expression == expression   // Equality
expression != expression   // Inequality
expression < expression    // Less than
expression > expression    // Greater than
expression <= expression   // Less than or equal
expression >= expression   // Greater than or equal

Left-associative: Multiple comparisons chain left-to-right

Examples:

let is_equal = x == y;
let is_valid = age >= 18 && age <= 65;  // Note: && not shown in grammar
let in_range = min <= value && value <= max;

Assignment Operator (Precedence 0)

lvalue = expression    // Assignment

Right-associative: a = b = c equals a = (b = c)

Examples:

x = 42;
array[i] = value;
struct_instance.field = new_value;

Precedence Examples

Arithmetic Expressions

// Without parentheses
let result1 = 2 + 3 * 4;        // 2 + (3 * 4) = 14
let result2 = 10 - 6 / 2;       // 10 - (6 / 2) = 7
let result3 = a + b * c - d;    // a + (b * c) - d

// With explicit parentheses for clarity
let result4 = (2 + 3) * 4;      // 20
let result5 = (10 - 6) / 2;     // 2

Mixed Arithmetic and Comparison

// Arithmetic before comparison
let is_positive = x + y > 0;         // (x + y) > 0
let in_bounds = i * 2 < array.length(); // (i * 2) < (array.length())

// Explicit grouping
let complex_check = (a + b) * (c - d) >= threshold;

Function Calls and Property Access

// Postfix operators have highest precedence
let result = object.method().value + 10;    // ((object.method()).value) + 10
let data = array[index].process() * factor; // ((array[index]).process()) * factor

// Method chaining
let formatted = user.get_name().to_lower().trim();

Complex Expressions

// Multiple precedence levels
let complex = base + offset * scale > threshold;
// Parsed as: (base + (offset * scale)) > threshold

let calculation = func(a + b) * array[i] - constant;
// Parsed as: (func(a + b) * array[i]) - constant

let validation = struct_obj.validate(input.trim()) == expected_result;
// Parsed as: (struct_obj.validate(input.trim())) == expected_result

Common Precedence Pitfalls

Unary Minus vs Binary Minus

let a = 5;
let b = 3;

let result1 = a + -b;    // a + (-b) = 5 + (-3) = 2
let result2 = a - b;     // a - b = 5 - 3 = 2
let result3 = a+-b;      // Same as result1, but less readable

Array Access vs Multiplication

let array = &[1, 2, 3, 4];
let index = 1;
let multiplier = 2;

let value = array[index] * multiplier;  // (array[index]) * multiplier = 2 * 2 = 4
// NOT array[(index * multiplier)]

Function Calls vs Arithmetic

fn get_value(): i64 { 42 }
fn calculate(x: i64): i64 { x * 2 }

let result = get_value() + calculate(10);  // 42 + 20 = 62
let scaled = get_value() * 2 + 1;          // (42 * 2) + 1 = 85

Method Calls vs Comparison

struct Counter {
    value: i64;
}

instance Counter {
    fn get(): i64 { this.value }
}

let counter = Counter { value: 42 };
let is_large = counter.get() > 30;  // (counter.get()) > 30 = true

Best Practices

Use Parentheses for Clarity

Even when precedence rules make them unnecessary, parentheses can improve readability:

// Technically correct but potentially confusing
let result = a + b * c - d / e;

// Clearer with explicit grouping
let result = a + (b * c) - (d / e);

// Very clear with intermediate variables
let product = b * c;
let quotient = d / e;
let result = a + product - quotient;

Break Complex Expressions

// Hard to read
let complex = object.method(param1 + param2 * factor).process() > threshold && flag;

// Better
let adjusted_param = param2 * factor;
let method_result = object.method(param1 + adjusted_param);
let processed = method_result.process();
let is_valid = processed > threshold && flag;

Consistent Spacing

// Good: Consistent spacing helps show precedence
let result = a + b * c;
let check = value >= min && value <= max;

// Less clear: Inconsistent spacing
let result = a+b*c;
let check = value>=min&&value<=max;

Associativity Examples

Left Associativity

Most operators are left-associative:

// Addition (left-associative)
let sum = a + b + c + d;  // ((a + b) + c) + d

// Subtraction (left-associative)
let diff = a - b - c;     // (a - b) - c

// Multiplication (left-associative)
let product = a * b * c;  // (a * b) * c

// Division (left-associative)
let quotient = a / b / c; // (a / b) / c
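
Associativity only matters for operators where grouping changes the result, such as subtraction and division. The following Rust sketch (hypothetical helper names, not part of Y) folds the same operands from the left and from the right to show the difference.

```rust
// Folds a non-empty list of operands with subtraction, grouping from the left.
fn sub_left(xs: &[i64]) -> i64 {
    // ((a - b) - c) - ...
    xs[1..].iter().fold(xs[0], |acc, &x| acc - x)
}

// Same operands, grouped from the right.
fn sub_right(xs: &[i64]) -> i64 {
    // a - (b - (c - ...))
    let last = xs[xs.len() - 1];
    xs[..xs.len() - 1].iter().rev().fold(last, |acc, &x| x - acc)
}
```

For the operands 10, 3, 2 the left grouping gives (10 - 3) - 2 = 5, which is what a left-associative minus produces, while the right grouping gives 10 - (3 - 2) = 9. Both helpers assume at least one operand and panic on an empty slice.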

Right Associativity

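Assignment is listed as right-associative. Note that the Grammar Reference defines Assignment as a statement (LValue "=" Expression ";") rather than an expression, so the chained form below may not be accepted by the current parser: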

// Assignment (right-associative)
let mut a: i64 = 0;
let mut b: i64 = 0;
let mut c: i64 = 0;

a = b = c = 42;  // a = (b = (c = 42))
// Result: a = 42, b = 42, c = 42

This precedence reference should help you write correct and readable Y expressions without unexpected operator precedence issues.

Built-in Types Reference

This page provides comprehensive reference information for all built-in types in the Y programming language, including their properties, memory layout, valid operations, and usage patterns.

Primitive Types

Integer Types

i64 - 64-bit Signed Integer

  • Size: 8 bytes (64 bits)
  • Range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
  • Default numeric type: Most integer literals default to i64

let count: i64 = 42;
let negative: i64 = -1000;
let max_value: i64 = 9223372036854775807;

Operations:

  • Arithmetic: +, -, *, /
  • Comparison: ==, !=, <, >, <=, >=
  • Assignment: =

u32 - 32-bit Unsigned Integer

  • Size: 4 bytes (32 bits)
  • Range: 0 to 4,294,967,295
  • Use case: When you need a smaller integer type or explicitly unsigned values

let port: u32 = 8080;
let size: u32 = 1024;
let max_u32: u32 = 4294967295;

Operations:

  • Arithmetic: +, -, *, /
  • Comparison: ==, !=, <, >, <=, >=
  • Assignment: =

Floating Point Types

f64 - 64-bit Floating Point

  • Size: 8 bytes (64 bits)
  • Range: ±1.7976931348623157E+308 (IEEE 754 double precision)
  • Precision: ~15-17 decimal digits
  • Default floating type: Floating literals default to f64

let pi: f64 = 3.14159265358979;
let price: f64 = 99.99;
let small: f64 = 0.000123;  // The grammar defines no exponent notation such as 1.23e-4

Operations:

  • Arithmetic: +, -, *, /
  • Comparison: ==, !=, <, >, <=, >=
  • Assignment: =

Special Values:

  • Positive infinity
  • Negative infinity
  • NaN (Not a Number)

Boolean Type

bool - Boolean

  • Size: 1 byte
  • Values: true or false
  • Use case: Logical operations, conditions, flags

let is_ready: bool = true;
let debug_mode: bool = false;
let result: bool = x > 0;

Operations:

  • Logical: &&, ||, ! (conceptual - may not be fully implemented)
  • Comparison: ==, !=
  • Assignment: =

Character Types

char - Unicode Character

  • Size: Variable (1-4 bytes UTF-8)
  • Range: Any valid Unicode code point
  • Use case: Single characters, text processing

let letter: char = 'a';
let digit: char = '5';
let unicode: char = '🚀';
let escape: char = '\n';

Operations:

  • Comparison: ==, !=, <, >, <=, >=
  • Assignment: =

Escape Sequences:

  • \n - Newline
  • \t - Tab
  • \\ - Backslash
  • \' - Single quote
  • \" - Double quote
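
As a minimal sketch of how these escapes might be decoded, the Rust functions below translate the five sequences listed above. The function names are hypothetical, and rejecting unknown escapes is an assumption: the text does not say how Y treats them.

```rust
// Maps the character after a backslash to the character it denotes,
// covering only the five escapes listed above.
fn unescape_char(c: char) -> Option<char> {
    match c {
        'n' => Some('\n'),
        't' => Some('\t'),
        '\\' => Some('\\'),
        '\'' => Some('\''),
        '"' => Some('"'),
        _ => None,
    }
}

// Decodes the body of a literal (the text between the quotes).
// Returns None on an unknown escape or a trailing backslash.
fn unescape(body: &str) -> Option<String> {
    let mut out = String::new();
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            out.push(unescape_char(chars.next()?)?);
        } else {
            out.push(c);
        }
    }
    Some(out)
}
```

The same table serves both char and str literals, since both use the backslash forms shown in the list.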

str - String Slice

  • Size: Variable (UTF-8 encoded)
  • Properties: Immutable sequence of characters
  • Use case: Text data, string literals

let greeting: str = "Hello, World!";
let empty: str = "";
let multiline: str = "Line 1\nLine 2";

Operations:

  • Comparison: ==, !=, <, >, <=, >= (lexicographic)
  • Assignment: =
  • Property access: .len() (when available)

Composite Types

Array Types

&[T] - Array Reference

  • Size: Pointer to data + length information
  • Properties: Reference to a contiguous sequence of elements of type T
  • Mutability: Elements can be modified if array is mutable

let numbers: &[i64] = &[1, 2, 3, 4, 5];
let empty: &[str] = &[];
let mut mutable_array: &[i64] = &[10, 20, 30];

Operations:

  • Indexing: array[index]
  • Assignment (to elements): array[index] = value
  • Property access: .length() (conceptual)

Memory Layout:

&[i64] -> [pointer to data][length]
           |
           v
          [elem0][elem1][elem2]...
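
Rust's slice references use the same pointer-plus-length representation, so the layout can be observed there. This is a Rust sketch for comparison only; whether the Y backend produces exactly this representation is not confirmed by the text, and the helper names are hypothetical.

```rust
use std::mem::size_of;

// A Rust slice reference is a (pointer, length) pair, so it is twice the
// size of a plain reference.
fn slice_ref_size() -> usize {
    size_of::<&[i64]>()
}

fn thin_ref_size() -> usize {
    size_of::<&i64>()
}

// Reads both halves of the pair back out of a slice.
fn describe(values: &[i64]) -> (*const i64, usize) {
    (values.as_ptr(), values.len())
}
```

On a 64-bit target, thin_ref_size() is 8 bytes and slice_ref_size() is 16 bytes, matching the two-word layout drawn above.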

Function Types

(ParamTypes) -> ReturnType - Function Type

  • Size: Pointer size (8 bytes on 64-bit systems)
  • Properties: Reference to executable code
  • Use case: Function parameters, callbacks, functions stored in variables or struct fields

// Function taking no parameters, returning i64
let getter: () -> i64 = get_value;

// Function taking two i64s, returning i64
let binary_op: (i64, i64) -> i64 = add;

// Function taking a function, returning i64
let higher_order: ((i64) -> i64) -> i64 = apply_twice;

Operations:

  • Function call: function(arguments)
  • Assignment: =
  • Comparison: ==, != (identity comparison)

Special Types

void - No Value

  • Size: 0 bytes
  • Use case: Functions that don't return a value
  • Note: Normally used only as a return type. The external-declaration examples in this book also use void as a placeholder type for opaque handles and pointers

fn print_message(msg: str): void {
    printf(msg);
    // No return value
}

Type Conversion and Compatibility

Implicit Conversions

Y has no implicit type conversions. All type conversions must be explicit.

let int_val: i64 = 42;
let uint_val: u32 = 100;

// This would be an error:
// let result = int_val + uint_val;  // Error: type mismatch

// Explicit conversion required (conceptual):
// let result = int_val + to_i64(uint_val);
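
Rust applies the same rule to integer types, which makes it a useful reference for what explicit conversions can look like. The sketch below is Rust, not Y. Here to_u32 is a hypothetical helper, and the to_i64 function in the Y snippet above is itself conceptual.

```rust
use std::convert::TryFrom;

// Widening u32 -> i64 can never fail, so an infallible conversion is available.
fn add_mixed(int_val: i64, uint_val: u32) -> i64 {
    // `int_val + uint_val` would not compile: the operand types differ.
    int_val + i64::from(uint_val)
}

// Narrowing i64 -> u32 may fail, so the result is optional.
fn to_u32(value: i64) -> Option<u32> {
    u32::try_from(value).ok()
}
```

Separating the infallible widening conversion from the fallible narrowing one is a design choice worth noting: it forces the caller to decide what happens when a value does not fit.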

Type Compatibility Rules

  • Exact match required: Variables must match their declared types exactly
  • No automatic promotion: u32 doesn't automatically become i64
  • No automatic widening: f32 (if it existed) wouldn't become f64

Memory Layout and Alignment

Primitive Type Sizes

Type  Size (bytes)  Alignment (bytes)
i64   8             8
u32   4             4
f64   8             8
bool  1             1
char  1-4           1
str   Variable      1
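
The fixed-size rows of this table can be cross-checked against the equally named Rust types. This is a sketch for comparison, assuming Y maps these primitives to the same machine representations. Note that Rust's own char is always 4 bytes in memory, so the 1-4 byte figure is illustrated through the UTF-8 encoded width instead.

```rust
use std::mem::{align_of, size_of};

// (name, size, alignment) in bytes for the Rust types that match Y's
// fixed-size primitives.
fn layout_table() -> [(&'static str, usize, usize); 4] {
    [
        ("i64", size_of::<i64>(), align_of::<i64>()),
        ("u32", size_of::<u32>(), align_of::<u32>()),
        ("f64", size_of::<f64>(), align_of::<f64>()),
        ("bool", size_of::<bool>(), align_of::<bool>()),
    ]
}

// Number of bytes a character occupies when encoded as UTF-8 (1 to 4).
fn utf8_width(c: char) -> usize {
    c.len_utf8()
}
```

Sizes are the same on all common targets. Alignment of 8-byte types can be smaller than 8 on some 32-bit platforms, so the alignment column is best read as describing typical 64-bit systems.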

Array Memory Layout

let array: &[i64] = &[1, 2, 3];

Memory layout:

array variable: [data_ptr: 8 bytes][length: 8 bytes]
                     |
                     v
heap/stack data: [1: 8 bytes][2: 8 bytes][3: 8 bytes]

Struct Memory Layout

struct Example {
    flag: bool;    // 1 byte
    count: i64;    // 8 bytes (may have 7 bytes padding before)
    value: f64;    // 8 bytes
}

Memory layout (with alignment):

[flag: 1 byte][padding: 7 bytes][count: 8 bytes][value: 8 bytes]
Total size: 24 bytes
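
The 24-byte figure follows from placing each field at the next offset that satisfies its alignment. The Rust sketch below reproduces that calculation and compares it with a repr(C) struct of the same shape. It assumes Y lays out fields in declaration order with C-style padding, which the text implies but does not spell out. align_up and computed_size are hypothetical helpers.

```rust
use std::mem::{align_of, size_of};

// `repr(C)` keeps fields in declaration order, matching the layout drawn above.
// (Without it, Rust is free to reorder fields.)
#[allow(dead_code)]
#[repr(C)]
struct Example {
    flag: bool, // 1 byte, followed by padding
    count: i64, // 8 bytes
    value: f64, // 8 bytes
}

// Rounds `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

// Computes the struct size by hand: place each field at its aligned offset,
// then round the total up to the largest field alignment.
fn computed_size() -> usize {
    let fields = [
        (size_of::<bool>(), align_of::<bool>()),
        (size_of::<i64>(), align_of::<i64>()),
        (size_of::<f64>(), align_of::<f64>()),
    ];
    let mut offset: usize = 0;
    let mut max_align: usize = 1;
    for (size, align) in fields {
        offset = align_up(offset, align) + size;
        max_align = max_align.max(align);
    }
    align_up(offset, max_align)
}
```

On typical 64-bit targets both computed_size() and size_of::<Example>() report 24. Moving flag to the end of the struct would not shrink it, because the total is still rounded up to a multiple of 8.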

Type Usage Patterns

Choosing Numeric Types

// Use i64 for general integer values
let age: i64 = 25;
let count: i64 = 1000;

// Use u32 for specific cases requiring unsigned values
let port: u32 = 8080;
let size: u32 = 1024;

// Use f64 for floating point calculations
let price: f64 = 99.99;
let percentage: f64 = 0.15;

Working with Strings and Characters

// String literals for text data
let message: str = "Hello, World!";
let filename: str = "data.txt";

// Characters for individual character processing
let separator: char = ',';
let newline: char = '\n';

// Arrays of characters for mutable text processing
let mut buffer: &[char] = &['H', 'e', 'l', 'l', 'o'];

Function Type Patterns

// Simple callback
let callback: () -> void = cleanup;

// Data processor
let processor: (str) -> str = normalize_text;

// Predicate function
let validator: (i64) -> bool = is_valid_age;

// Higher-order function
let mapper: (&[i64], (i64) -> i64) -> &[i64] = transform_array;

Type Limits and Constraints

Integer Overflow

Overflow behavior for Y's integer types is implementation-dependent:

let max_i64: i64 = 9223372036854775807;
// let overflow = max_i64 + 1;  // Behavior depends on implementation
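
Since the behavior is left to the implementation, it helps to name the usual options. The Rust sketch below shows three explicit policies using standard library methods. Which of them, if any, Y adopts is not stated in the text, and the wrapper function names are hypothetical.

```rust
// Three explicit overflow policies that an implementation could choose between.

// Reports overflow to the caller instead of producing a value.
fn checked_increment(x: i64) -> Option<i64> {
    x.checked_add(1) // None when the result does not fit in i64
}

// Wraps around using two's-complement arithmetic.
fn wrapping_increment(x: i64) -> i64 {
    x.wrapping_add(1)
}

// Clamps at the largest representable value.
fn saturating_increment(x: i64) -> i64 {
    x.saturating_add(1)
}
```

For the maximum i64 value, the checked form yields no value, the wrapping form yields the minimum i64 value, and the saturating form returns its input unchanged.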

Floating Point Precision

let precise: f64 = 0.1 + 0.2;  // May not exactly equal 0.3
let comparison: bool = precise == 0.3;  // May be false due to precision
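
The usual remedy is to compare with a tolerance rather than with ==. The sketch below is Rust, since f64 follows IEEE 754 there as well. approx_eq and the chosen tolerance are illustrative assumptions, not part of Y.

```rust
// Compares two floats with a tolerance instead of `==`.
// The tolerance is scaled by the larger magnitude (at least 1.0) so the
// check stays meaningful for large values.
fn approx_eq(a: f64, b: f64, rel_tol: f64) -> bool {
    let scale = a.abs().max(b.abs()).max(1.0);
    (a - b).abs() <= rel_tol * scale
}

// Exact comparison of the sum from the example above.
fn sum_is_exact() -> bool {
    0.1 + 0.2 == 0.3
}
```

Here sum_is_exact() is false, because 0.1 + 0.2 is not exactly representable as 0.3 in binary floating point, while approx_eq(0.1 + 0.2, 0.3, 1e-9) is true.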

Array Bounds

let array: &[i64] = &[1, 2, 3];
let valid: i64 = array[0];     // Valid: index 0
let valid2: i64 = array[2];    // Valid: index 2
// let invalid = array[3];     // Runtime error: out of bounds

Best Practices

Type Selection

  • Use i64 for general-purpose integers
  • Use u32 only when you specifically need unsigned semantics
  • Use f64 for all floating-point calculations
  • Use bool for true/false values, not integers
  • Use char for individual characters, str for text

Type Annotations

// Explicit when type isn't obvious
let empty_array: &[i64] = &[];
let function_var: (i64) -> bool = validator;

// Can be omitted when obvious
let count = 42;           // Obviously i64
let name = "Alice";       // Obviously str
let ready = true;         // Obviously bool

Safe Operations

// Check bounds before array access
fn safe_get(array: &[i64], index: i64): i64 {
    if (index >= 0 && index < array.length()) {
        return array[index];
    } else {
        return 0;  // Or appropriate default/error handling
    }
}

// Validate function parameters
fn safe_divide(a: f64, b: f64): f64 {
    if (b != 0.0) {
        return a / b;
    } else {
        return 0.0;  // Or appropriate error handling
    }
}

Implementation

This section covers the implementation details of the Y programming language, focusing on code generation using Inkwell and LLVM. Each page provides comprehensive examples with conceptual explanations.

Sections

  • Foundation Concepts - Core LLVM abstractions, type system mapping, and architectural principles
  • Literals and Types - Implementation of primitive types, constants, and type conversions
  • Variables and Memory - Memory allocation, variable storage, scope management, and mutability
  • Operations - Binary operations, unary operations, comparisons, and type-specific behaviors
  • Functions - Function declaration, parameters, calls, returns, and calling conventions
  • Control Flow - If expressions, while loops, blocks, and advanced control constructs
  • Data Structures - Arrays, structs, tuples, and complex data manipulation
  • Advanced Constructs - Lambdas, closures, method calls, and advanced language features
  • Complete Examples - Full program examples demonstrating real-world usage patterns

Overview

Y Lang is implemented using Rust and leverages LLVM for code generation through the Inkwell library. The implementation follows a traditional compiler pipeline:

  1. Lexing - Tokenizing source code using pattern matching and regex
  2. Parsing - Building an Abstract Syntax Tree (AST) with grammar-driven development
  3. Type Checking - Two-phase semantic analysis with dependency resolution
  4. Code Generation - Converting typed AST to LLVM IR using Inkwell

Design Philosophy

The code generation focuses on:

  • Clarity over Performance: Readable LLVM IR generation that can be optimized later
  • Type Safety: Leveraging LLVM's type system to catch errors early
  • Debugging Support: Generating meaningful names and structured IR
  • Extensibility: Patterns that can accommodate future language features

Reading This Documentation

Each section builds upon previous concepts:

  • Start with Foundation Concepts to understand LLVM basics
  • Progress through the sections in order for comprehensive understanding
  • Reference Complete Examples to see how concepts combine
  • Use individual sections as reference material for specific constructs

The examples focus on the "why" behind implementation decisions, not just the "how".

Foundation Concepts

Understanding LLVM's core abstractions and design philosophy is essential for implementing Y Lang constructs effectively. This section covers the fundamental concepts that underpin all code generation.

LLVM Architecture Overview

LLVM separates compilation into distinct phases, each with clear responsibilities:

  1. Frontend (Y Lang parser/type checker) → LLVM IR
  2. Optimization passes → Optimized LLVM IR
  3. Backend → Target assembly/machine code

Y Lang's code generator focuses on the first step: producing clean, correct LLVM IR that can be optimized and compiled to any target.

Core Abstractions

Context: The Global State Container

Why Context exists: LLVM interns and caches types and constants inside a Context. The Context guarantees type identity and owns the associated memory for the entire compilation unit.

#![allow(unused)]
fn main() {
use inkwell::context::Context;

let context = Context::create();
// All types created from this context are compatible
let i64_type_1 = context.i64_type();
let i64_type_2 = context.i64_type();
assert_eq!(i64_type_1, i64_type_2); // Same type instance
}

Key principles:

  • One Context per compilation unit
  • All types from the same Context are compatible
  • Context owns the memory for types and constants
  • Not thread-safe: a Context must not be shared between threads, although separate Contexts can be used on separate threads

Module: The Compilation Unit

Why Modules exist: LLVM organizes code into modules, which represent single compilation units (like .c files). Modules contain functions, global variables, and metadata.

#![allow(unused)]
fn main() {
let module = context.create_module("my_program");

// Modules can contain:
// - Function declarations and definitions
// - Global variables and constants
// - Type definitions
// - Metadata (debug info, etc.)
}

Module organization in Y Lang:

  • One module per Y Lang source file
  • Global functions and constants declared at module level
  • Module name typically matches source file name

Builder: The Instruction Generator

Why Builder exists: LLVM instructions must be generated in sequence within basic blocks. The Builder manages this positioning and provides the API for instruction generation.

#![allow(unused)]
fn main() {
let builder = context.create_builder();

// Builder is always positioned within a basic block
// Instructions are inserted at the current position
let add_result = builder.build_int_add(left, right, "sum").unwrap();
// Next instruction will be inserted after the add
}

Builder positioning patterns:

#![allow(unused)]
fn main() {
// Position at end of basic block (most common)
builder.position_at_end(basic_block);

// Position before specific instruction (rare)
builder.position_before(&some_instruction);

// Always check current position when debugging
let current_block = builder.get_insert_block().unwrap();
}

Type System Mapping

Y Lang's type system maps to LLVM types with specific design decisions:

Primitive Type Mapping

Y Lang Type   LLVM Type   Size      Reasoning
i64           i64         64 bits   Direct mapping, platform independent
f64           double      64 bits   IEEE 754 double precision
bool          i1          1 bit     Minimal storage, efficient operations
char          i8          8 bits    UTF-8 byte, composable into strings
() (void)     void        0 bits    Represents no value

#![allow(unused)]
fn main() {
// Type creation examples
let i64_type = context.i64_type();
let f64_type = context.f64_type();
let bool_type = context.bool_type();
let void_type = context.void_type();
let ptr_type = context.ptr_type(Default::default()); // Opaque pointer
}

Reference and Pointer Types

Why pointers: Y Lang references map to LLVM pointers because LLVM doesn't have high-level reference semantics.

#![allow(unused)]
fn main() {
// Y Lang: &i64
// LLVM: ptr (opaque pointer to i64-sized memory)
let ptr_to_i64 = context.ptr_type(Default::default());

// All pointers are opaque in modern LLVM
// Type safety comes from how you load/store
}

Aggregate Types

Structs: Y Lang structs become LLVM struct types with named fields mapped to indices.

#![allow(unused)]
fn main() {
// Y Lang: struct Point { x: i64, y: i64 }
// LLVM: { i64, i64 }
let point_type = context.struct_type(&[
    i64_type.into(),  // field 0: x
    i64_type.into(),  // field 1: y
], false); // false = not packed
}

Arrays: Fixed-size homogeneous collections.

#![allow(unused)]
fn main() {
// Y Lang: &[i64; 5]
// LLVM: [5 x i64]
let array_type = i64_type.array_type(5);
}

Tuples: Anonymous structs with positional access.

#![allow(unused)]
fn main() {
// Y Lang: (i64, f64, bool)
// LLVM: { i64, double, i1 }
let tuple_type = context.struct_type(&[
    i64_type.into(),
    f64_type.into(),
    bool_type.into(),
], false);
}

Memory Model

Stack vs Heap Allocation

Stack allocation with alloca:

#![allow(unused)]
fn main() {
// Allocates on the current function's stack frame
let var_alloca = builder.build_alloca(i64_type, "local_var").unwrap();

// Properties:
// - Automatically deallocated when function returns
// - Fast allocation (just stack pointer adjustment)
// - Limited by stack size
// - Address is stable within function
}

Heap allocation patterns:

#![allow(unused)]
fn main() {
// Y Lang doesn't expose heap allocation directly
// But internal runtime functions might use malloc/free
// Example: dynamic strings, closures with captures
}

Memory Access Patterns

Load and Store:

#![allow(unused)]
fn main() {
let ptr = builder.build_alloca(i64_type, "var").unwrap();

// Store: memory[ptr] = value
let value = i64_type.const_int(42, false);
builder.build_store(ptr, value).unwrap();

// Load: value = memory[ptr]
let loaded = builder.build_load(i64_type, ptr, "loaded").unwrap();
}

GetElementPtr (GEP) for typed address computation (the Inkwell call is marked unsafe because indices are not bounds-checked):

#![allow(unused)]
fn main() {
// Access array element
let array_ptr = builder.build_alloca(array_type, "arr").unwrap();
let index = i64_type.const_int(2, false);
let element_ptr = unsafe {
    builder.build_gep(
        array_type,
        array_ptr,
        &[i64_type.const_int(0, false), index], // [base_offset, element_index]
        "element_ptr"
    ).unwrap()
};

// GEP calculates: base_ptr + (0 * sizeof(array)) + (2 * sizeof(i64))
// Result: pointer to array[2]
}
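The address arithmetic described in the comment above can be written out as plain Rust. This is only a model of what a two-index GEP into `[len x elem]` computes; `gep_byte_offset` is a hypothetical helper, and it assumes elements are laid out back to back without padding (true for `i64` arrays).

```rust
/// Byte offset produced by a two-index GEP into an array type.
/// `array_index` steps over whole arrays, `element_index` over elements.
fn gep_byte_offset(elem_size: u64, len: u64, array_index: u64, element_index: u64) -> u64 {
    let array_size = elem_size * len;
    array_index * array_size + element_index * elem_size
}
```

For the `[5 x i64]` example, indices `[0, 2]` give an offset of 16 bytes from the base pointer. GEP itself never touches memory; it only computes the address that a later load or store uses.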

Value System

LLVM distinguishes between different value categories:

Constants vs Variables

Constants: Compile-time known values

#![allow(unused)]
fn main() {
let const_42 = i64_type.const_int(42, false);
let const_pi = f64_type.const_float(3.14159);
let const_true = bool_type.const_int(1, false);

// Constants can be used directly in operations
let const_sum = const_42.const_add(i64_type.const_int(8, false));
}

Variables: Runtime values that may change

#![allow(unused)]
fn main() {
// Variables require memory allocation and load/store
let var_alloca = builder.build_alloca(i64_type, "var").unwrap();
builder.build_store(var_alloca, const_42).unwrap();
let runtime_value = builder.build_load(i64_type, var_alloca, "value").unwrap();
}

Value Naming and SSA Form

Static Single Assignment (SSA): Each value is assigned exactly once

#![allow(unused)]
fn main() {
// Good: Each result has a unique name
let a = builder.build_int_add(x, y, "a").unwrap();
let b = builder.build_int_mul(a, z, "b").unwrap();
let c = builder.build_int_sub(b, w, "c").unwrap();

// LLVM IR:
// %a = add i64 %x, %y
// %b = mul i64 %a, %z
// %c = sub i64 %b, %w
}

Naming conventions:

  • Use descriptive names for debugging: "user_age", "total_cost"
  • Include operation context: "array_length", "loop_counter"
  • Temporary values can use generic names: "tmp", "result"
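LLVM renames a value on its own when a name is reused within a function, but a code generator may still want predictable names for temporaries. The following is one possible sketch of a name allocator; `NameAllocator` and `fresh` are hypothetical names, not part of Inkwell or the Y compiler.

```rust
use std::collections::HashSet;

/// Hands out names that are unique within one function.
#[derive(Default)]
struct NameAllocator {
    used: HashSet<String>,
}

impl NameAllocator {
    /// Returns `hint` if it is unused, otherwise `hint1`, `hint2`, ...
    fn fresh(&mut self, hint: &str) -> String {
        let mut candidate = hint.to_string();
        let mut suffix = 0u32;
        // `insert` returns false when the name is already taken.
        while !self.used.insert(candidate.clone()) {
            suffix += 1;
            candidate = format!("{}{}", hint, suffix);
        }
        candidate
    }
}
```

Requesting "tmp" twice yields "tmp" and then "tmp1", so every SSA value keeps a distinct, readable name.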

Basic Blocks and Control Flow

Basic Block Structure

What is a basic block: A sequence of instructions with:

  • Single entry point (at the beginning)
  • Single exit point (terminator instruction)
  • No jumps into the middle

#![allow(unused)]
fn main() {
let function = module.add_function("example", void_type.fn_type(&[], false), None);
let entry_block = context.append_basic_block(function, "entry");

builder.position_at_end(entry_block);
// Add instructions here...

// Every basic block must end with a terminator
builder.build_return(None).unwrap(); // ret void
}

Terminator Instructions

All basic blocks must end with exactly one terminator:

#![allow(unused)]
fn main() {
// Unconditional branch
builder.build_unconditional_branch(target_block).unwrap();

// Conditional branch
builder.build_conditional_branch(condition, then_block, else_block).unwrap();

// Return
builder.build_return(Some(&return_value)).unwrap();
builder.build_return(None).unwrap(); // void return

// Unreachable (for impossible code paths)
builder.build_unreachable().unwrap();
}
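The "exactly one terminator, at the end" rule can be modelled without LLVM. The sketch below uses a toy instruction enum (`Instr`) and a hypothetical `is_well_formed` check; it is an illustration of the rule, not how LLVM's verifier is implemented.

```rust
/// A toy instruction set: just enough to tell terminators apart.
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    Add,
    Load,
    Store,
    Branch,
    CondBranch,
    Return,
    Unreachable,
}

impl Instr {
    fn is_terminator(&self) -> bool {
        matches!(
            self,
            Instr::Branch | Instr::CondBranch | Instr::Return | Instr::Unreachable
        )
    }
}

/// A block is well-formed when its last instruction is its only terminator.
fn is_well_formed(block: &[Instr]) -> bool {
    match block.split_last() {
        Some((last, body)) => {
            last.is_terminator() && body.iter().all(|instr| !instr.is_terminator())
        }
        None => false, // an empty block has no terminator
    }
}
```

A check of this shape is useful before switching the builder to another block, since a missing or duplicated terminator otherwise only surfaces when the module is verified.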

Control Flow Patterns

Sequential flow:

#![allow(unused)]
fn main() {
let block1 = context.append_basic_block(function, "block1");
let block2 = context.append_basic_block(function, "block2");

builder.position_at_end(block1);
// ... instructions ...
builder.build_unconditional_branch(block2).unwrap();

builder.position_at_end(block2);
// ... more instructions ...
builder.build_return(None).unwrap();
}

Conditional flow with merge:

#![allow(unused)]
fn main() {
let condition_block = context.append_basic_block(function, "condition");
let then_block = context.append_basic_block(function, "then");
let else_block = context.append_basic_block(function, "else");
let merge_block = context.append_basic_block(function, "merge");

// Condition evaluation
builder.position_at_end(condition_block);
let cond = /* ... evaluate condition ... */;
builder.build_conditional_branch(cond, then_block, else_block).unwrap();

// Then path
builder.position_at_end(then_block);
let then_value = /* ... compute then value ... */;
builder.build_unconditional_branch(merge_block).unwrap();

// Else path
builder.position_at_end(else_block);
let else_value = /* ... compute else value ... */;
builder.build_unconditional_branch(merge_block).unwrap();

// Merge point with PHI
builder.position_at_end(merge_block);
let phi = builder.build_phi(i64_type, "merged_value").unwrap();
phi.add_incoming(&[(&then_value, then_block), (&else_value, else_block)]);
}

Error Handling in Code Generation

Inkwell Error Patterns

Most Inkwell operations return Result types:

#![allow(unused)]
fn main() {
// Handle errors explicitly
let result = match builder.build_int_add(left, right, "sum") {
    Ok(result) => result,
    Err(e) => panic!("Failed to build add instruction: {}", e),
};

// Or use unwrap() for prototype code
let result = builder.build_int_add(left, right, "sum").unwrap();

// Use expect() for better error messages
let result = builder.build_int_add(left, right, "sum")
    .expect("Integer addition should never fail with valid operands");
}

Common Error Conditions

  1. Type mismatches: Using incompatible types in operations
  2. Missing terminators: Basic blocks without terminator instructions
  3. Invalid positioning: Builder not positioned in a basic block
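  4. Name conflicts: Reusing a value name within a function; LLVM does not report an error but silently appends a numeric suffix, which can make the IR harder to follow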

Defensive Programming

#![allow(unused)]
fn main() {
// Verify builder is positioned
assert!(builder.get_insert_block().is_some(), "Builder must be positioned in a basic block");

// Verify types before operations
assert_eq!(left.get_type(), right.get_type(), "Operand types must match");

// Check for existing terminators
let current_block = builder.get_insert_block().unwrap();
if current_block.get_terminator().is_some() {
    panic!("Cannot add instructions after terminator");
}
}

Performance Considerations

Compile-Time Performance

  • Type interning: Context automatically interns types, so type creation is fast
  • Instruction building: Builder operations are relatively cheap
  • Memory usage: LLVM IR uses significant memory; avoid creating unnecessary instructions

Runtime Performance

  • Stack allocation: Prefer alloca over heap allocation when possible
  • Constant folding: Use LLVM constants for compile-time known values
  • Optimization passes: Generate simple IR and let LLVM optimize

Debugging and Development

#![allow(unused)]
fn main() {
// Use descriptive names for values and blocks
let user_age = builder.build_load(i64_type, age_ptr, "user_age").unwrap();
let is_adult = builder.build_int_compare(
    IntPredicate::SGE,
    user_age,
    i64_type.const_int(18, false),
    "is_adult"
).unwrap();

// Name basic blocks meaningfully
let check_age_block = context.append_basic_block(function, "check_age");
let adult_path_block = context.append_basic_block(function, "adult_path");
let minor_path_block = context.append_basic_block(function, "minor_path");
}

This foundation provides the conceptual framework for all Y Lang code generation. Understanding these patterns enables implementing any language construct effectively.

Literals and Types

This section covers implementing Y Lang's literal values and type system using Inkwell, focusing on the conceptual mapping between Y Lang constructs and LLVM representations.

Primitive Type Mapping

Y Lang's type system maps directly to LLVM's type system, with each Y Lang type having a clear LLVM counterpart.

Integer Literals

Why integers need explicit typing: LLVM requires all values to have concrete types. Y Lang's i64 maps to LLVM's i64 type, ensuring platform independence and consistent overflow behavior.

#![allow(unused)]
fn main() {
use inkwell::context::Context;

let context = Context::create();
let i64_type = context.i64_type();

// Creating integer constants
let zero = i64_type.const_int(0, false);     // second argument is sign_extend: false = no sign extension
let positive = i64_type.const_int(42, false);
let negative = i64_type.const_int(-10i64 as u64, true); // true = sign-extend the bit pattern
}

LLVM IR representation (constants are operands, not instructions, so they appear inline wherever they are used):

i64 0
i64 42
i64 -10

Implementation steps:

  1. Get the target integer type from context
  2. Use const_int() with appropriate signedness flag
  3. Handle negative values by casting to unsigned representation
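Step 3 relies on two's-complement reinterpretation, which can be checked in plain Rust. The helper names below are hypothetical and exist only to make the cast explicit.

```rust
/// Bit pattern handed to `const_int` for a signed literal.
fn to_const_int_bits(value: i64) -> u64 {
    value as u64 // reinterprets the same 64 bits as unsigned
}

/// Recovers the signed value from the stored bit pattern.
fn from_const_int_bits(bits: u64) -> i64 {
    bits as i64
}
```

For -10 the bit pattern is 0xFFFF_FFFF_FFFF_FFF6, and casting back yields -10 again, so no information is lost by the round trip.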

Floating Point Literals

Why separate float handling: LLVM distinguishes between integer and floating-point arithmetic to enable proper IEEE 754 compliance and optimization.

#![allow(unused)]
fn main() {
let f64_type = context.f64_type();

// Creating float constants
let pi = f64_type.const_float(3.14159);
let euler = f64_type.const_float(2.71828);
let negative_float = f64_type.const_float(-1.5);
}

LLVM IR representation (constants appear inline as operands rather than as separate instructions):

double 3.14159
double 2.71828
double -1.5

IEEE 754 considerations:

  • LLVM uses IEEE 754 double precision for f64
  • Special values (NaN, infinity) are handled automatically
  • Exact decimal representation may differ due to binary encoding

Boolean Literals

Why single-bit booleans: LLVM's i1 type enables efficient boolean operations and memory usage, with clear true/false semantics.

#![allow(unused)]
fn main() {
let bool_type = context.bool_type();

// Creating boolean constants
let true_val = bool_type.const_int(1, false);   // true = 1
let false_val = bool_type.const_int(0, false);  // false = 0

// Alternative using LLVM's built-in constants
let true_val_alt = bool_type.const_all_ones();  // all bits set = true
let false_val_alt = bool_type.const_zero();     // all bits clear = false
}

LLVM IR representation (constants appear inline as operands rather than as separate instructions):

i1 true
i1 false

Boolean semantics:

  • An i1 holds only 0 (false) or 1 (true); wider integers must be compared against zero to produce an i1
  • Comparisons and logical operations return i1 values
  • Can be promoted to larger integer types when needed

Character Literals

Why i8 for characters: Y Lang treats characters as UTF-8 bytes, allowing for efficient string composition and manipulation.

#![allow(unused)]
fn main() {
let i8_type = context.i8_type();

// Creating character constants
let char_a = i8_type.const_int(b'a' as u64, false);
let char_newline = i8_type.const_int(b'\n' as u64, false);
let char_unicode = i8_type.const_int(0xC3, false); // First byte of UTF-8 sequence
}

LLVM IR representation (constants appear inline as operands rather than as separate instructions):

i8 97      ; 'a'
i8 10      ; '\n'
i8 195     ; 0xC3

UTF-8 handling considerations:

  • Single-byte ASCII characters map directly
  • Multi-byte Unicode requires sequence handling
  • String operations must respect UTF-8 boundaries
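The byte-level view behind these considerations can be shown with Rust's standard library alone. Both helpers below are hypothetical illustrations; how Y's runtime handles multi-byte characters is not specified in this section.

```rust
/// Returns the UTF-8 bytes that a character occupies.
fn utf8_bytes(c: char) -> Vec<u8> {
    let mut buffer = [0u8; 4]; // a char needs at most 4 bytes in UTF-8
    c.encode_utf8(&mut buffer).as_bytes().to_vec()
}

/// Truncates `text` to at most `max_bytes` without splitting a character.
fn truncate_on_boundary(text: &str, max_bytes: usize) -> &str {
    let mut end = max_bytes.min(text.len());
    // Index 0 is always a boundary, so this loop terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}
```

The character U+00E9 encodes as the two bytes 0xC3 0xA9, which is why a single i8 can hold only the first byte of the sequence.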

String Literals

Why strings are complex: LLVM doesn't have a built-in string type. Strings are implemented as arrays of bytes with additional metadata.

Simple String Constants

#![allow(unused)]
fn main() {
// Method 1: Global string pointer (most common)
let hello_str = "Hello, World!";
let hello_global = builder.build_global_string_ptr(hello_str, "hello").unwrap();

// Method 2: String as array constant
let hello_bytes = hello_str.as_bytes();
let i8_type = context.i8_type();
let string_array_type = i8_type.array_type(hello_bytes.len() as u32);

let char_values: Vec<_> = hello_bytes.iter()
    .map(|&b| i8_type.const_int(b as u64, false))
    .collect();
let string_constant = i8_type.const_array(&char_values);
}

Generated LLVM IR:

; Method 1: Global string pointer
@hello = private constant [14 x i8] c"Hello, World!\00"

; Method 2: Array constant (an operand or initializer value, not an instruction)
[13 x i8] [i8 72, i8 101, i8 108, i8 108, i8 111, i8 44, i8 32,
           i8 87, i8 111, i8 114, i8 108, i8 100, i8 33]

String with Length Information

Y Lang strings likely need length information for bounds checking and iteration:

#![allow(unused)]
fn main() {
// String representation: { ptr, length }
let ptr_type = context.ptr_type(Default::default());
let string_struct_type = context.struct_type(&[
    ptr_type.into(),     // data pointer
    i64_type.into(),     // length
], false);

// Create string literal with metadata
let hello_ptr = builder.build_global_string_ptr("Hello", "hello_data").unwrap()
    .as_pointer_value(); // build_store needs a pointer value, not the GlobalValue
let hello_len = i64_type.const_int(5, false);

// Allocate string struct
let string_alloca = builder.build_alloca(string_struct_type, "string_literal").unwrap();

// Set data pointer
let ptr_field = builder.build_struct_gep(string_struct_type, string_alloca, 0, "data_ptr").unwrap();
builder.build_store(ptr_field, hello_ptr).unwrap();

// Set length
let len_field = builder.build_struct_gep(string_struct_type, string_alloca, 1, "len_ptr").unwrap();
builder.build_store(len_field, hello_len).unwrap();
}

Generated LLVM IR:

@hello_data = private constant [6 x i8] c"Hello\00"

%string_literal = alloca { ptr, i64 }
%data_ptr = getelementptr { ptr, i64 }, ptr %string_literal, i32 0, i32 0
store ptr @hello_data, ptr %data_ptr
%len_ptr = getelementptr { ptr, i64 }, ptr %string_literal, i32 0, i32 1
store i64 5, ptr %len_ptr
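The IR above contains two different sizes: the stored length (5) and the backing array (6 bytes, because of the trailing NUL). A small Rust sketch makes the relationship explicit; the function names are hypothetical, and the assumption is that the length field counts bytes rather than characters.

```rust
/// Length stored in the `{ ptr, length }` struct: bytes, excluding the terminator.
fn stored_length(text: &str) -> u64 {
    text.len() as u64
}

/// Size of the backing global once the trailing NUL byte is appended.
fn backing_array_size(text: &str) -> u64 {
    text.len() as u64 + 1
}
```

Keeping the length separate means the NUL terminator is only needed for interoperability with C functions, not for finding the end of the string.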

Void Type

Why void exists: Represents functions and expressions that don't return meaningful values, enabling proper type checking and optimization.

#![allow(unused)]
fn main() {
let void_type = context.void_type();

// Used in function signatures
let void_fn_type = void_type.fn_type(&[], false);

// Used in return statements
builder.build_return(None).unwrap(); // returns void
}

Generated LLVM IR:

define void @some_function() {
  ret void
}

Type Conversions

Y Lang requires explicit type conversions between different primitive types. LLVM provides specific instructions for each conversion type.

Integer Conversions

#![allow(unused)]
fn main() {
// Widening (safe)
let i32_type = context.i32_type();
let small_int = i32_type.const_int(42, false);
let widened = builder.build_int_z_extend(small_int, i64_type, "widened").unwrap();

// Narrowing (potentially unsafe)
let large_int = i64_type.const_int(1000, false);
let narrowed = builder.build_int_truncate(large_int, i32_type, "narrowed").unwrap();

// Signed vs unsigned interpretation
let signed_val = builder.build_int_s_extend(small_int, i64_type, "signed_ext").unwrap();
}

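Generated LLVM IR (shown unfolded for illustration; with constant operands the builder folds these conversions at build time):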

%widened = zext i32 42 to i64
%narrowed = trunc i64 1000 to i32
%signed_ext = sext i32 42 to i64
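With the value 42 all three conversions give the same number, so the difference between them only shows up for negative or large inputs. The Rust casts below mirror `zext`, `sext`, and `trunc`; the function names are illustrative only.

```rust
/// zext: treat the 32 bits as unsigned, then widen with zeros.
fn zero_extend(value: i32) -> i64 {
    (value as u32) as i64
}

/// sext: widen by copying the sign bit.
fn sign_extend(value: i32) -> i64 {
    value as i64
}

/// trunc: keep only the low 32 bits.
fn truncate(value: i64) -> i32 {
    value as i32
}
```

For -1, zero extension produces 4294967295 while sign extension keeps -1, which is why the choice has to follow the signedness of the Y Lang source type.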

Float-Integer Conversions

#![allow(unused)]
fn main() {
// Float to integer
let float_val = f64_type.const_float(3.14);
let float_to_int = builder.build_float_to_signed_int(float_val, i64_type, "f_to_i").unwrap();

// Integer to float
let int_val = i64_type.const_int(42, false);
let int_to_float = builder.build_signed_int_to_float(int_val, f64_type, "i_to_f").unwrap();
}

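Generated LLVM IR (shown unfolded for illustration; with constant operands the builder folds these conversions at build time):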

%f_to_i = fptosi double 3.14 to i64
%i_to_f = sitofp i64 42 to double
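The numeric effect of these two conversions can be previewed with Rust casts, which round the same way for in-range values. The function names are hypothetical. One difference to keep in mind: Rust's `as` saturates for out-of-range floats, whereas LLVM's `fptosi` yields a poison value in that case.

```rust
/// Float to signed integer: the fractional part is discarded (rounds toward zero).
fn float_to_int(value: f64) -> i64 {
    value as i64
}

/// Signed integer to float: exact only while the value fits in 53 bits.
fn int_to_float(value: i64) -> f64 {
    value as f64
}
```

So 3.14 becomes 3 and -3.99 becomes -3, while very large integers such as 2^53 + 1 lose their lowest bit when converted to f64.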

Boolean Conversions

#![allow(unused)]
fn main() {
// Integer to boolean (zero = false, nonzero = true)
let int_val = i64_type.const_int(5, false);
let zero = i64_type.const_zero();
let to_bool = builder.build_int_compare(
    IntPredicate::NE,
    int_val,
    zero,
    "to_bool"
).unwrap();

// Boolean to integer
let bool_val = bool_type.const_int(1, false);
let to_int = builder.build_int_z_extend(bool_val, i64_type, "to_int").unwrap();
}

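Generated LLVM IR (shown unfolded for illustration; with constant operands the builder folds these conversions at build time):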

%to_bool = icmp ne i64 5, 0
%to_int = zext i1 true to i64

Advanced Type Concepts

Type Equivalence and Compatibility

Why type checking matters: LLVM performs strict type checking. Operations between incompatible types will fail at IR generation time.

#![allow(unused)]
fn main() {
// These are the same type (interned by context)
let i64_a = context.i64_type();
let i64_b = context.i64_type();
assert_eq!(i64_a, i64_b); // Same type instance

// These are different types
let i32_type = context.i32_type();
// builder.build_int_add(i64_val, i32_val, "error"); // Would fail!
}

Type Size and Alignment

#![allow(unused)]
fn main() {
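use inkwell::targets::{InitializationConfig, Target};

// Targets must be initialized before they can be looked up by name
Target::initialize_x86(&InitializationConfig::default());

// Get type sizes (requires target data)
let target_machine = Target::from_name("x86-64").unwrap()
    .create_target_machine(/* ... */).unwrap();
let target_data = target_machine.get_target_data();

// Concrete sizes in bytes for the selected target
let i64_size = target_data.get_abi_size(&i64_type);
let string_size = target_data.get_abi_size(&string_struct_type);

// Target-independent alternative: size as an LLVM constant expression
let i64_size_expr = i64_type.size_of();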
}

Opaque Pointers

Modern LLVM uses opaque pointers that don't encode pointee types:

#![allow(unused)]
fn main() {
// All pointers are the same type
let ptr_type = context.ptr_type(Default::default());

// Type information comes from load/store operations
let loaded_i64 = builder.build_load(i64_type, ptr, "loaded").unwrap();
let loaded_f64 = builder.build_load(f64_type, ptr, "loaded_f").unwrap();
}

Implementation Patterns

Literal Processing Pipeline

  1. Lexical Analysis: Recognize literal tokens (numbers, strings, booleans)
  2. Type Inference: Determine appropriate LLVM type
  3. Constant Creation: Generate LLVM constant values
  4. Type Checking: Verify compatibility with context

Dynamic vs Static Typing

Y Lang is statically typed with type inference, so generated code routinely mixes compile-time constants with runtime values:

#![allow(unused)]
fn main() {
// Known at compile time
let static_val = i64_type.const_int(42, false);

// Runtime value (from variable, computation)
let runtime_val = builder.build_load(i64_type, some_ptr, "runtime").unwrap();

// Mixed operations
let result = builder.build_int_add(static_val, runtime_val, "mixed").unwrap();
}

Error Handling for Type Operations

#![allow(unused)]
fn main() {
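use inkwell::builder::Builder;
use inkwell::values::IntValue;

// Safe type checking before operations
// (the explicit 'ctx lifetime ties all values to the same Context)
fn safe_add<'ctx>(
    builder: &Builder<'ctx>,
    left: IntValue<'ctx>,
    right: IntValue<'ctx>,
    name: &str,
) -> Result<IntValue<'ctx>, String> {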
    if left.get_type() != right.get_type() {
        return Err(format!("Type mismatch: {:?} vs {:?}", left.get_type(), right.get_type()));
    }

    builder.build_int_add(left, right, name)
        .map_err(|e| format!("Failed to build add: {}", e))
}
}

Performance Considerations

Constant Folding

LLVM automatically performs constant folding:

#![allow(unused)]
fn main() {
// These will be computed at compile time
let a = i64_type.const_int(10, false);
let b = i64_type.const_int(20, false);
let sum = a.const_add(b); // Computed immediately, not at runtime
}

Type Interning

The Context automatically interns types, making type operations efficient:

#![allow(unused)]
fn main() {
// Multiple calls return the same type instance
let type1 = context.i64_type();
let type2 = context.i64_type();
// type1 and type2 are the same object in memory
}

Memory Layout Optimization

#![allow(unused)]
fn main() {
// Packed structs save memory but may hurt performance
let packed_struct = context.struct_type(&[
    i8_type.into(),
    i64_type.into(),
], true); // true = packed

// Regular structs have natural alignment
let aligned_struct = context.struct_type(&[
    i8_type.into(),
    i64_type.into(),
], false); // false = natural alignment
}
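The size difference between the two layouts can be observed directly in Rust, whose `repr(C)` and `repr(C, packed)` attributes follow the same rules as unpacked and packed LLVM structs. The struct names are illustrative, and the concrete size of the aligned variant (16 bytes) assumes a typical 64-bit target where `u64` is 8-byte aligned.

```rust
use std::mem::size_of;

/// Natural layout: padding follows `tag` so that `value` is aligned.
#[repr(C)]
struct Aligned {
    tag: u8,
    value: u64,
}

/// Packed layout: fields are adjacent with no padding.
#[repr(C, packed)]
struct Packed {
    tag: u8,
    value: u64,
}

/// Returns (aligned size, packed size) in bytes.
fn layout_sizes() -> (usize, usize) {
    (size_of::<Aligned>(), size_of::<Packed>())
}
```

The packed struct always occupies 9 bytes, but reading `value` then becomes an unaligned access, which is the performance cost mentioned in the comment above.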

This comprehensive coverage of literals and types provides the foundation for implementing Y Lang's type system in LLVM, focusing on the conceptual reasoning behind each choice and common implementation patterns.

Variables and Memory

This section covers variable declaration, memory allocation, scope management, and mutability in Y Lang using Inkwell's memory management primitives.

Variable Declaration and Storage

Why variables need memory: Unlike constants which exist in the IR directly, variables represent mutable storage locations that can change during program execution. LLVM provides stack allocation through alloca instructions.

Basic Variable Declaration

Y Lang variables map to stack-allocated memory slots:

#![allow(unused)]
fn main() {
use inkwell::context::Context;

let context = Context::create();
let module = context.create_module("variables");
let builder = context.create_builder();

// Create a function context for variables
let i64_type = context.i64_type();
let fn_type = context.void_type().fn_type(&[], false);
let function = module.add_function("test_vars", fn_type, None);
let entry_block = context.append_basic_block(function, "entry");
builder.position_at_end(entry_block);

// Declare variable: let x: i64;
let x_alloca = builder.build_alloca(i64_type, "x").unwrap();
}

Generated LLVM IR:

define void @test_vars() {
entry:
  %x = alloca i64
  ret void
}

Implementation steps:

  1. Position builder in a function's basic block
  2. Use build_alloca to reserve stack space
  3. Store variable metadata (name, type) in symbol table
  4. Handle initialization separately

Variable Initialization

Variables can be initialized at declaration or later:

#![allow(unused)]
fn main() {
// Declare and initialize: let x = 42;
let x_alloca = builder.build_alloca(i64_type, "x").unwrap();
let initial_value = i64_type.const_int(42, false);
builder.build_store(x_alloca, initial_value).unwrap();

// Or initialize from another expression
let computed_value = builder.build_int_add(
    i64_type.const_int(10, false),
    i64_type.const_int(20, false),
    "computed"
).unwrap();
let y_alloca = builder.build_alloca(i64_type, "y").unwrap();
builder.build_store(y_alloca, computed_value).unwrap();
}

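Generated LLVM IR (conceptually; because both operands are constants, the builder folds the addition and the actual output stores i64 30 directly):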

%x = alloca i64
store i64 42, ptr %x
%computed = add i64 10, 20
%y = alloca i64
store i64 %computed, ptr %y

Variable Access (Loading)

Reading a variable's value requires loading from memory:

#![allow(unused)]
fn main() {
// Read variable value: x
let x_value = builder.build_load(i64_type, x_alloca, "x_val").unwrap();

// Use in expression: x + 10
let result = builder.build_int_add(
    x_value.into_int_value(),
    i64_type.const_int(10, false),
    "x_plus_10"
).unwrap();
}

Generated LLVM IR:

%x_val = load i64, ptr %x
%x_plus_10 = add i64 %x_val, 10

Mutability and Assignment

Y Lang distinguishes between immutable and mutable variables, but at the LLVM level, all variables are potentially mutable through their memory allocation.

Immutable Variables

Even "immutable" variables use alloca for consistency, but the type checker prevents reassignment:

#![allow(unused)]
fn main() {
// let x = 42; (immutable)
let x_alloca = builder.build_alloca(i64_type, "x").unwrap();
let value = i64_type.const_int(42, false);
builder.build_store(x_alloca, value).unwrap();

// Immutability enforced by Y Lang type checker, not LLVM
}

Mutable Variables

Mutable variables allow reassignment through additional store operations:

#![allow(unused)]
fn main() {
// let mut y = 10; (mutable)
let y_alloca = builder.build_alloca(i64_type, "y").unwrap();
let initial = i64_type.const_int(10, false);
builder.build_store(y_alloca, initial).unwrap();

// y = 20; (assignment)
let new_value = i64_type.const_int(20, false);
builder.build_store(y_alloca, new_value).unwrap();
}

Generated LLVM IR:

%y = alloca i64
store i64 10, ptr %y
store i64 20, ptr %y

Assignment Expressions

Y Lang assignment returns the assigned value:

#![allow(unused)]
fn main() {
// x = y = 42; (chained assignment)
let value = i64_type.const_int(42, false);

// y = 42
builder.build_store(y_alloca, value).unwrap();

// x = y (but we use the immediate value for efficiency)
builder.build_store(x_alloca, value).unwrap();

// The expression evaluates to the assigned value
let assignment_result = value; // Value of the assignment expression
}

Generated LLVM IR:

store i64 42, ptr %y
store i64 42, ptr %x
; %assignment_result is just 42 (the constant)

Scope Management

Variables have lexical scope that must be tracked during code generation.

Block Scopes

Y Lang blocks create new variable scopes:

#![allow(unused)]
fn main() {
// Outer scope
let outer_var = builder.build_alloca(i64_type, "outer").unwrap();

// Enter new block scope
// { let inner = 10; ... }
let inner_var = builder.build_alloca(i64_type, "inner").unwrap();
let inner_value = i64_type.const_int(10, false);
builder.build_store(inner_var, inner_value).unwrap();

// Exit block scope - variables still exist in LLVM but become inaccessible
// through symbol table management
}

Generated LLVM IR:

%outer = alloca i64
%inner = alloca i64
store i64 10, ptr %inner
; Both allocations persist until function return

Implementation pattern for scope management:

#![allow(unused)]
fn main() {
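use std::collections::HashMap;
use inkwell::values::PointerValue;

// 'ctx ties every stored pointer to the Context that created it
struct ScopeManager<'ctx> {
    scopes: Vec<HashMap<String, PointerValue<'ctx>>>,
}

impl<'ctx> ScopeManager<'ctx> {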
    fn enter_scope(&mut self) {
        self.scopes.push(HashMap::new());
    }

    fn exit_scope(&mut self) {
        self.scopes.pop();
    }

    fn declare_variable(&mut self, name: String, alloca: PointerValue<'ctx>) -> Result<(), String> {
        let current_scope = self.scopes.last_mut()
            .ok_or("No active scope")?;

        if current_scope.contains_key(&name) {
            return Err(format!("Variable '{}' already declared in this scope", name));
        }

        current_scope.insert(name, alloca);
        Ok(())
    }

    fn lookup_variable(&self, name: &str) -> Option<PointerValue<'ctx>> {
        // Search scopes from innermost to outermost
        for scope in self.scopes.iter().rev() {
            if let Some(&alloca) = scope.get(name) {
                return Some(alloca);
            }
        }
        None
    }
}
}
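The lookup order can be tried out without LLVM. The following is a minimal sketch, assuming a generic `ScopeStack<V>` (a hypothetical name) in which `V` stands in for the alloca pointer; it is not the compiler's actual data structure.

```rust
use std::collections::HashMap;

/// Generic stand-in for the scope manager: `V` plays the role of the alloca pointer.
struct ScopeStack<V> {
    scopes: Vec<HashMap<String, V>>,
}

impl<V: Copy> ScopeStack<V> {
    fn new() -> Self {
        // Start with one scope so function-level declarations have a home.
        ScopeStack { scopes: vec![HashMap::new()] }
    }

    fn enter_scope(&mut self) {
        self.scopes.push(HashMap::new());
    }

    fn exit_scope(&mut self) {
        self.scopes.pop();
    }

    fn declare(&mut self, name: &str, value: V) -> Result<(), String> {
        let current = self.scopes.last_mut().ok_or("No active scope")?;
        if current.contains_key(name) {
            return Err(format!("Variable '{}' already declared in this scope", name));
        }
        current.insert(name.to_string(), value);
        Ok(())
    }

    fn lookup(&self, name: &str) -> Option<V> {
        // Innermost scope first, so inner declarations shadow outer ones.
        self.scopes.iter().rev().find_map(|scope| scope.get(name).copied())
    }
}

/// Returns what `x` resolves to inside the inner block and after leaving it.
fn shadowing_demo() -> (Option<u32>, Option<u32>) {
    let mut scopes = ScopeStack::new();
    scopes.declare("x", 1).unwrap();
    scopes.enter_scope();
    scopes.declare("x", 2).unwrap(); // shadows the outer x
    let inside = scopes.lookup("x");
    scopes.exit_scope();
    (inside, scopes.lookup("x"))
}
```

Inside the inner block `x` resolves to the inner declaration; after `exit_scope` the outer one becomes visible again, even though both allocations would still exist in the generated function.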

Function Parameter Scope

Function parameters need special handling since they arrive as values, not allocations:

#![allow(unused)]
fn main() {
// Define function: fn add(a: i64, b: i64) -> i64
let param_types = vec![
    BasicMetadataTypeEnum::IntType(i64_type),
    BasicMetadataTypeEnum::IntType(i64_type),
];
let fn_type = i64_type.fn_type(&param_types, false);
let function = module.add_function("add", fn_type, None);

let entry_block = context.append_basic_block(function, "entry");
builder.position_at_end(entry_block);

// Parameters come as values, allocate them for mutability
let param_a = function.get_nth_param(0).unwrap().into_int_value();
let param_b = function.get_nth_param(1).unwrap().into_int_value();

let a_alloca = builder.build_alloca(i64_type, "a").unwrap();
let b_alloca = builder.build_alloca(i64_type, "b").unwrap();

builder.build_store(a_alloca, param_a).unwrap();
builder.build_store(b_alloca, param_b).unwrap();

// Now parameters can be used like local variables
let a_val = builder.build_load(i64_type, a_alloca, "a_val").unwrap();
let b_val = builder.build_load(i64_type, b_alloca, "b_val").unwrap();
let sum = builder.build_int_add(a_val.into_int_value(), b_val.into_int_value(), "sum").unwrap();

builder.build_return(Some(&sum)).unwrap();
}

Generated LLVM IR:

define i64 @add(i64 %0, i64 %1) {
entry:
  %a = alloca i64
  %b = alloca i64
  store i64 %0, ptr %a
  store i64 %1, ptr %b
  %a_val = load i64, ptr %a
  %b_val = load i64, ptr %b
  %sum = add i64 %a_val, %b_val
  ret i64 %sum
}

Memory Layout and Optimization

Stack Frame Organization

LLVM automatically manages stack frame layout, but understanding the principles helps with optimization:

#![allow(unused)]
fn main() {
// Multiple variable declarations create stack frame
let var1 = builder.build_alloca(i64_type, "var1").unwrap();    // 8 bytes
let var2 = builder.build_alloca(f64_type, "var2").unwrap();    // 8 bytes
let var3 = builder.build_alloca(bool_type, "var3").unwrap();   // 1 byte
let var4 = builder.build_alloca(i64_type, "var4").unwrap();    // 8 bytes

// The LLVM backend decides where these slots end up in the stack frame
}

Generated LLVM IR:

%var1 = alloca i64      ; 8-byte aligned
%var2 = alloca double   ; 8-byte aligned
%var3 = alloca i1       ; 1-byte, but may be padded
%var4 = alloca i64      ; 8-byte aligned
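Why padding matters can be shown with a simplified layout model. This sketch assumes slots are placed in declaration order and aligned naively; `align_up` and `frame_size` are hypothetical helpers, and the real placement is up to the LLVM backend, which may also promote slots to registers entirely.

```rust
/// Rounds `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

/// Size in bytes of a frame that places `(size, align)` slots in the given order.
fn frame_size(slots: &[(usize, usize)]) -> usize {
    let mut offset = 0;
    for &(size, align) in slots {
        offset = align_up(offset, align) + size;
    }
    offset
}
```

With the four variables above, `frame_size(&[(8, 8), (8, 8), (1, 1), (8, 8)])` needs 32 bytes because the 1-byte slot forces 7 bytes of padding before `var4`, while moving the 1-byte slot to the end needs only 25.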

Avoiding Unnecessary Allocations

For simple, non-reassigned variables, consider using SSA values directly:

#![allow(unused)]
fn main() {
// Instead of this (allocation-heavy):
let temp_alloca = builder.build_alloca(i64_type, "temp").unwrap();
let computed = builder.build_int_add(x, y, "computed").unwrap();
builder.build_store(temp_alloca, computed).unwrap();
let temp_val = builder.build_load(i64_type, temp_alloca, "temp_val").unwrap();

// Use this (direct SSA):
let computed = builder.build_int_add(x, y, "computed").unwrap();
// Use 'computed' directly in subsequent operations
}

Memory Access Patterns

Efficient memory access means reusing a loaded value instead of reloading it, which is valid as long as no store to the same variable happens in between:

#![allow(unused)]
fn main() {
// Inefficient: repeated loads
let x_val1 = builder.build_load(i64_type, x_alloca, "x1").unwrap();
let y_val1 = builder.build_load(i64_type, y_alloca, "y1").unwrap();
let result1 = builder.build_int_add(x_val1.into_int_value(), y_val1.into_int_value(), "r1").unwrap();

let x_val2 = builder.build_load(i64_type, x_alloca, "x2").unwrap(); // Redundant load
let result2 = builder.build_int_mul(x_val2.into_int_value(), i64_type.const_int(2, false), "r2").unwrap();

// Efficient: load once, reuse values
let x_val = builder.build_load(i64_type, x_alloca, "x").unwrap().into_int_value();
let y_val = builder.build_load(i64_type, y_alloca, "y").unwrap().into_int_value();
let result1 = builder.build_int_add(x_val, y_val, "r1").unwrap();
let result2 = builder.build_int_mul(x_val, i64_type.const_int(2, false), "r2").unwrap();
}

Advanced Memory Concepts

Composite Type Variables

Structs and arrays require more complex allocation patterns:

#![allow(unused)]
fn main() {
// Struct variable: let point = Point { x: 10, y: 20 };
let field_types = vec![i64_type.into(), i64_type.into()];
let point_type = context.struct_type(&field_types, false);
let point_alloca = builder.build_alloca(point_type, "point").unwrap();

// Initialize fields
let x_ptr = builder.build_struct_gep(point_type, point_alloca, 0, "x_ptr").unwrap();
let y_ptr = builder.build_struct_gep(point_type, point_alloca, 1, "y_ptr").unwrap();

builder.build_store(x_ptr, i64_type.const_int(10, false)).unwrap();
builder.build_store(y_ptr, i64_type.const_int(20, false)).unwrap();
}

Generated LLVM IR:

%point = alloca { i64, i64 }
%x_ptr = getelementptr { i64, i64 }, ptr %point, i32 0, i32 0
%y_ptr = getelementptr { i64, i64 }, ptr %point, i32 0, i32 1
store i64 10, ptr %x_ptr
store i64 20, ptr %y_ptr

Array Variables

Arrays use similar patterns with index-based access:

#![allow(unused)]
fn main() {
// Array variable: let arr = [1, 2, 3, 4, 5];
let array_type = i64_type.array_type(5);
let array_alloca = builder.build_alloca(array_type, "arr").unwrap();

// Initialize with constant array
let values = [1, 2, 3, 4, 5].map(|v| i64_type.const_int(v, false));
let array_constant = i64_type.const_array(&values);
builder.build_store(array_alloca, array_constant).unwrap();

// Access element: arr[2]
let zero = i64_type.const_int(0, false);
let index = i64_type.const_int(2, false);
let element_ptr = unsafe {
    builder.build_gep(array_type, array_alloca, &[zero, index], "elem_ptr").unwrap()
};
let element_val = builder.build_load(i64_type, element_ptr, "elem").unwrap();
}

Generated LLVM IR:

%arr = alloca [5 x i64]
store [5 x i64] [i64 1, i64 2, i64 3, i64 4, i64 5], ptr %arr
%elem_ptr = getelementptr [5 x i64], ptr %arr, i64 0, i64 2
%elem = load i64, ptr %elem_ptr

Reference Variables

Y Lang references map to pointers in LLVM:

#![allow(unused)]
fn main() {
// Reference variable: let ref_x = &x;
let ref_x_alloca = builder.build_alloca(ptr_type, "ref_x").unwrap();
builder.build_store(ref_x_alloca, x_alloca).unwrap();

// Dereferencing: *ref_x
let ptr_val = builder.build_load(ptr_type, ref_x_alloca, "ptr").unwrap();
let deref_val = builder.build_load(i64_type, ptr_val.into_pointer_value(), "deref").unwrap();
}

Generated LLVM IR:

%ref_x = alloca ptr
store ptr %x, ptr %ref_x
%ptr = load ptr, ptr %ref_x
%deref = load i64, ptr %ptr

Error Handling and Validation

Variable Lifecycle Validation

Track variable states to prevent common errors:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy)]
enum VariableState {
    Declared,      // Allocated but not initialized
    Initialized,   // Has a value
    Moved,         // Value has been moved (for move semantics)
}

struct VariableInfo<'ctx> {
    alloca: PointerValue<'ctx>,
    var_type: BasicTypeEnum<'ctx>,
    state: VariableState,
    is_mutable: bool,
}

impl VariableInfo<'_> {
    fn can_read(&self) -> bool {
        matches!(self.state, VariableState::Initialized)
    }

    fn can_assign(&self) -> bool {
        self.is_mutable && !matches!(self.state, VariableState::Moved)
    }
}
}

Type Safety in Variable Operations

Ensure type compatibility before memory operations:

#![allow(unused)]
fn main() {
fn safe_store<'ctx>(
    builder: &Builder<'ctx>,
    alloca: PointerValue<'ctx>,
    value: BasicValueEnum<'ctx>,
    expected_type: BasicTypeEnum<'ctx>
) -> Result<(), String> {
    if value.get_type() != expected_type {
        return Err(format!(
            "Type mismatch: expected {:?}, got {:?}",
            expected_type, value.get_type()
        ));
    }

    builder.build_store(alloca, value)
        .map_err(|e| format!("Store failed: {}", e))?;

    Ok(())
}
}

Memory Safety Considerations

LLVM doesn't enforce memory safety, so the code generator has to implement checks that prevent common issues:

#![allow(unused)]
fn main() {
// Bounds checking for array access
fn safe_array_access<'ctx>(
    builder: &Builder<'ctx>,
    array_alloca: PointerValue<'ctx>,
    array_type: ArrayType<'ctx>,
    index: IntValue<'ctx>,
    element_type: BasicTypeEnum<'ctx>
) -> Result<BasicValueEnum<'ctx>, String> {
    let array_len = array_type.len();
    let index_val = index.get_zero_extended_constant()
        .ok_or("Dynamic array access requires runtime bounds checking")?;

    if index_val >= array_len as u64 {
        return Err(format!("Array index {} out of bounds (length {})", index_val, array_len));
    }

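    // Use the index's own integer type for the leading zero index;
    // the element type may not be an integer type at all (e.g. an array of floats)
    let zero = index.get_type().const_zero();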
    let element_ptr = unsafe {
        builder.build_gep(array_type, array_alloca, &[zero, index], "elem_ptr")
            .map_err(|e| format!("GEP failed: {}", e))?
    };

    builder.build_load(element_type, element_ptr, "element")
        .map_err(|e| format!("Load failed: {}", e))
}
}
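For indices that are only known at runtime, the check itself is a single unsigned comparison. The sketch below models that logic in plain Rust, assuming a signed 64-bit index; `check_index` is a hypothetical helper. In generated code the same test would be an `icmp uge` followed by a conditional branch to an error path.

```rust
/// Runtime bounds check for a signed index against an array length.
fn check_index(index: i64, len: u32) -> Result<usize, String> {
    // Reinterpreting the index as unsigned turns negative values into huge ones,
    // so one unsigned comparison rejects both negative and too-large indices.
    if (index as u64) >= u64::from(len) {
        return Err(format!("Array index {} out of bounds (length {})", index, len));
    }
    Ok(index as usize)
}
```

`check_index(2, 5)` succeeds, while both `check_index(5, 5)` and `check_index(-1, 5)` are rejected by the same comparison.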

Performance Optimization Strategies

Minimize Allocations

Use SSA form for temporary values:

#![allow(unused)]
fn main() {
// Good: Direct SSA computation
let a = builder.build_load(i64_type, a_alloca, "a").unwrap().into_int_value();
let b = builder.build_load(i64_type, b_alloca, "b").unwrap().into_int_value();
let temp1 = builder.build_int_add(a, b, "temp1").unwrap();
let temp2 = builder.build_int_mul(temp1, i64_type.const_int(2, false), "temp2").unwrap();
let result = builder.build_int_sub(temp2, i64_type.const_int(1, false), "result").unwrap();

// Avoid: Unnecessary allocations for temporaries
}

Leverage LLVM Optimizations

LLVM's optimization passes can eliminate redundant loads and stores:

#![allow(unused)]
fn main() {
// This pattern:
let var = builder.build_alloca(i64_type, "var").unwrap();
builder.build_store(var, i64_type.const_int(42, false)).unwrap();
let val = builder.build_load(i64_type, var, "val").unwrap();

// May be optimized so that the alloca, store and load disappear
// and every use of %val is replaced by the constant 42
}

Together, these patterns form the foundation for implementing Y Lang's variable system in LLVM, with attention to both correctness and performance.

Operations

This section covers implementing Y Lang's arithmetic, logical, and comparison operations using Inkwell, focusing on type-specific instruction selection and proper handling of different numeric types.

Binary Arithmetic Operations

Why separate instructions for different types: LLVM uses distinct instructions for integers and floats because their semantics differ. Integer instructions wrap on overflow and come in signed and unsigned variants, while floating-point instructions follow IEEE 754 rules, and the optimizer relies on that distinction.

Integer Arithmetic

Y Lang's integer operations map directly to LLVM's integer arithmetic instructions:

#![allow(unused)]
fn main() {
use inkwell::context::Context;
use inkwell::IntPredicate;

let context = Context::create();
let builder = context.create_builder();
let i64_type = context.i64_type();

// Basic arithmetic operations
let left = i64_type.const_int(10, false);
let right = i64_type.const_int(3, false);

// Addition: 10 + 3
let add_result = builder.build_int_add(left, right, "add").unwrap();

// Subtraction: 10 - 3
let sub_result = builder.build_int_sub(left, right, "sub").unwrap();

// Multiplication: 10 * 3
let mul_result = builder.build_int_mul(left, right, "mul").unwrap();

// Division: 10 / 3 (signed)
let div_result = builder.build_int_signed_div(left, right, "div").unwrap();

// Remainder: 10 % 3 (signed)
let rem_result = builder.build_int_signed_rem(left, right, "rem").unwrap();
}

Generated LLVM IR:

%add = add i64 10, 3
%sub = sub i64 10, 3
%mul = mul i64 10, 3
%div = sdiv i64 10, 3
%rem = srem i64 10, 3

Implementation considerations:

  • Use build_int_signed_div (sdiv) for Y Lang's signed integers so that negative numbers are handled correctly
  • build_int_unsigned_div (udiv) would treat negative numbers as large positive values
  • Integer division by zero is undefined behavior in LLVM, as is the signed overflow case i64::MIN / -1; consider emitting runtime checks
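The difference between the signed and unsigned instructions can be seen on the same bit pattern. This is a small model in plain Rust, not generated code; the function names mirror the LLVM instructions and are otherwise hypothetical. Note that Rust panics on a zero divisor, whereas LLVM leaves it undefined.

```rust
/// Models `sdiv`: operands are signed, the quotient truncates toward zero.
fn sdiv(a: i64, b: i64) -> i64 {
    a / b
}

/// Models `srem`: the remainder takes the sign of the dividend.
fn srem(a: i64, b: i64) -> i64 {
    a % b
}

/// Models `udiv`: the same 64 bits, reinterpreted as unsigned values.
fn udiv(a: i64, b: i64) -> u64 {
    (a as u64) / (b as u64)
}
```

`sdiv(-10, 3)` yields -3 and `srem(-10, 3)` yields -1, while `udiv(-10, 3)` treats -10 as 2^64 - 10 and returns a value above six quintillion.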

Floating Point Arithmetic

Floating-point operations use separate instructions with IEEE 754 semantics:

#![allow(unused)]
fn main() {
let f64_type = context.f64_type();
let left_f = f64_type.const_float(10.5);
let right_f = f64_type.const_float(3.2);

// Floating-point arithmetic
let fadd_result = builder.build_float_add(left_f, right_f, "fadd").unwrap();
let fsub_result = builder.build_float_sub(left_f, right_f, "fsub").unwrap();
let fmul_result = builder.build_float_mul(left_f, right_f, "fmul").unwrap();
let fdiv_result = builder.build_float_div(left_f, right_f, "fdiv").unwrap();
let frem_result = builder.build_float_rem(left_f, right_f, "frem").unwrap();
}

Generated LLVM IR:

%fadd = fadd double 10.5, 3.2
%fsub = fsub double 10.5, 3.2
%fmul = fmul double 10.5, 3.2
%fdiv = fdiv double 10.5, 3.2
%frem = frem double 10.5, 3.2

IEEE 754 special cases:

  • Division of a nonzero value by zero produces ±infinity (0.0 / 0.0 produces NaN), not undefined behavior
  • Operations with NaN propagate NaN
  • Overflow produces infinity rather than wrapping
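Rust's `f64` follows the same IEEE 754 rules, so the special cases can be checked directly. The wrappers below are hypothetical and only named after the corresponding LLVM instructions.

```rust
/// Models `fdiv double`.
fn fdiv(a: f64, b: f64) -> f64 {
    a / b
}

/// Models `fmul double`.
fn fmul(a: f64, b: f64) -> f64 {
    a * b
}

/// Models `fadd double`.
fn fadd(a: f64, b: f64) -> f64 {
    a + b
}
```

Here `fdiv(1.0, 0.0)` is positive infinity, `fdiv(0.0, 0.0)` is NaN, `fmul(f64::MAX, 2.0)` overflows to infinity, and any addition involving NaN stays NaN.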

Mixed-Type Arithmetic

Y Lang requires explicit type conversion for mixed-type operations:

#![allow(unused)]
fn main() {
// Convert integer to float for mixed arithmetic
let int_val = i64_type.const_int(42, false);
let float_val = f64_type.const_float(3.14);

// Convert int to float
let int_as_float = builder.build_signed_int_to_float(int_val, f64_type, "int_to_float").unwrap();

// Now can perform float arithmetic
let mixed_result = builder.build_float_add(int_as_float, float_val, "mixed_add").unwrap();
}

Generated LLVM IR:

%int_to_float = sitofp i64 42 to double
%mixed_add = fadd double %int_to_float, 3.14

Comparison Operations

Why comparison predicates matter: LLVM uses predicates to specify the exact comparison semantics, handling signed vs unsigned integers and NaN behavior for floats.

Integer Comparisons

#![allow(unused)]
fn main() {
let left = i64_type.const_int(10, false);
let right = i64_type.const_int(20, false);

// Equality: 10 == 20
let eq = builder.build_int_compare(IntPredicate::EQ, left, right, "eq").unwrap();

// Inequality: 10 != 20
let ne = builder.build_int_compare(IntPredicate::NE, left, right, "ne").unwrap();

// Signed comparisons
let slt = builder.build_int_compare(IntPredicate::SLT, left, right, "slt").unwrap(); // <
let sle = builder.build_int_compare(IntPredicate::SLE, left, right, "sle").unwrap(); // <=
let sgt = builder.build_int_compare(IntPredicate::SGT, left, right, "sgt").unwrap(); // >
let sge = builder.build_int_compare(IntPredicate::SGE, left, right, "sge").unwrap(); // >=

// Unsigned comparisons (if needed)
let ult = builder.build_int_compare(IntPredicate::ULT, left, right, "ult").unwrap();
}

Generated LLVM IR:

%eq = icmp eq i64 10, 20
%ne = icmp ne i64 10, 20
%slt = icmp slt i64 10, 20
%sle = icmp sle i64 10, 20
%sgt = icmp sgt i64 10, 20
%sge = icmp sge i64 10, 20
%ult = icmp ult i64 10, 20

Floating Point Comparisons

Float comparisons need special handling for NaN values:

#![allow(unused)]
fn main() {
use inkwell::FloatPredicate;

let left_f = f64_type.const_float(10.5);
let right_f = f64_type.const_float(20.3);

// Ordered comparisons (false if either operand is NaN)
let oeq = builder.build_float_compare(FloatPredicate::OEQ, left_f, right_f, "oeq").unwrap();
let olt = builder.build_float_compare(FloatPredicate::OLT, left_f, right_f, "olt").unwrap();
let ole = builder.build_float_compare(FloatPredicate::OLE, left_f, right_f, "ole").unwrap();
let ogt = builder.build_float_compare(FloatPredicate::OGT, left_f, right_f, "ogt").unwrap();
let oge = builder.build_float_compare(FloatPredicate::OGE, left_f, right_f, "oge").unwrap();

// Unordered comparisons (true if either operand is NaN)
let ueq = builder.build_float_compare(FloatPredicate::UEQ, left_f, right_f, "ueq").unwrap();
let une = builder.build_float_compare(FloatPredicate::UNE, left_f, right_f, "une").unwrap();
}

Generated LLVM IR:

%oeq = fcmp oeq double 10.5, 20.3
%olt = fcmp olt double 10.5, 20.3
%ole = fcmp ole double 10.5, 20.3
%ogt = fcmp ogt double 10.5, 20.3
%oge = fcmp oge double 10.5, 20.3
%ueq = fcmp ueq double 10.5, 20.3
%une = fcmp une double 10.5, 20.3

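NaN handling choice: Most languages use ordered comparisons (O-prefixed) for ==, <, <=, > and >=, making NaN == NaN false. The usual exception is !=, which maps to the unordered UNE predicate so that NaN != NaN is true.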
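The ordered and unordered predicates can be modelled with explicit NaN checks. This is an illustrative sketch in plain Rust; the function names follow the `fcmp` predicates and are otherwise hypothetical.

```rust
/// True if the comparison is "unordered", i.e. at least one operand is NaN.
fn is_unordered(a: f64, b: f64) -> bool {
    a.is_nan() || b.is_nan()
}

/// `fcmp oeq`: ordered and equal.
fn fcmp_oeq(a: f64, b: f64) -> bool {
    !is_unordered(a, b) && a == b
}

/// `fcmp one`: ordered and not equal.
fn fcmp_one(a: f64, b: f64) -> bool {
    !is_unordered(a, b) && a != b
}

/// `fcmp ueq`: unordered or equal.
fn fcmp_ueq(a: f64, b: f64) -> bool {
    is_unordered(a, b) || a == b
}

/// `fcmp une`: unordered or not equal.
fn fcmp_une(a: f64, b: f64) -> bool {
    is_unordered(a, b) || a != b
}
```

Rust's own `==` on `f64` behaves like `oeq` and its `!=` behaves like `une`, which is the pairing most languages settle on.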

Logical Operations

Why bitwise vs boolean logic: Y Lang distinguishes between bitwise operations on integers and logical operations on booleans.

Boolean Logic

#![allow(unused)]
fn main() {
let bool_type = context.bool_type();
let true_val = bool_type.const_int(1, false);
let false_val = bool_type.const_int(0, false);

// Logical AND: true && false
let and_result = builder.build_and(true_val, false_val, "and").unwrap();

// Logical OR: true || false
let or_result = builder.build_or(true_val, false_val, "or").unwrap();

// Logical NOT: !true
let not_result = builder.build_not(true_val, "not").unwrap();
}

Generated LLVM IR:

%and = and i1 true, false
%or = or i1 true, false
%not = xor i1 true, true  ; NOT implemented as XOR with all-ones

Short-Circuit Evaluation

Y Lang's && and || operators use short-circuit evaluation, requiring control flow:

#![allow(unused)]
fn main() {
// Implementing: a && b (short-circuit)
let a_cond = /* evaluate condition a */;

// Remember the block that ends with the branch on a; the PHI needs it as a predecessor
let a_end_block = builder.get_insert_block().unwrap();

let and_true_block = context.append_basic_block(function, "and_true");
let and_merge_block = context.append_basic_block(function, "and_merge");

// If a is false, skip evaluating b
builder.build_conditional_branch(a_cond, and_true_block, and_merge_block).unwrap();

// Evaluate b only if a was true
builder.position_at_end(and_true_block);
let b_cond = /* evaluate condition b */;
// Evaluating b may itself create new blocks, so record where it finished
let b_end_block = builder.get_insert_block().unwrap();
builder.build_unconditional_branch(and_merge_block).unwrap();

// Merge results with PHI
builder.position_at_end(and_merge_block);
let phi = builder.build_phi(bool_type, "and_result").unwrap();
phi.add_incoming(&[
    (&bool_type.const_int(0, false), a_end_block), // a was false
    (&b_cond, b_end_block)                         // a was true, result is b
]);
}

Generated LLVM IR:

br i1 %a_cond, label %and_true, label %and_merge

and_true:
  ; evaluate b_cond
  br label %and_merge

and_merge:
  %and_result = phi i1 [ false, %entry ], [ %b_cond, %and_true ]
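The observable effect of this control flow is that the right-hand side is never evaluated when the left-hand side is false. A minimal model in plain Rust, assuming the right operand is passed as a closure; `short_circuit_and` and `count_rhs_evaluations` are hypothetical names used only for illustration.

```rust
use std::cell::Cell;

/// Mirrors the block structure: the `if` is the conditional branch,
/// the `else` arm is the constant `false` fed into the PHI.
fn short_circuit_and(a: bool, eval_b: impl FnOnce() -> bool) -> bool {
    if a {
        eval_b() // "and_true" block: b is evaluated only here
    } else {
        false // edge from the first block straight to "and_merge"
    }
}

/// Returns the result of `a && true` and how often the right side was evaluated.
fn count_rhs_evaluations(a: bool) -> (bool, u32) {
    let evaluations = Cell::new(0);
    let result = short_circuit_and(a, || {
        evaluations.set(evaluations.get() + 1);
        true
    });
    (result, evaluations.get())
}
```

With `a` false the counter stays at zero; with `a` true the right side runs exactly once.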

Bitwise Operations

Integer bitwise operations for bit manipulation:

#![allow(unused)]
fn main() {
let left = i64_type.const_int(0b1010, false);  // 10 in binary
let right = i64_type.const_int(0b1100, false); // 12 in binary

// Bitwise AND: 1010 & 1100 = 1000
let bit_and = builder.build_and(left, right, "bit_and").unwrap();

// Bitwise OR: 1010 | 1100 = 1110
let bit_or = builder.build_or(left, right, "bit_or").unwrap();

// Bitwise XOR: 1010 ^ 1100 = 0110
let bit_xor = builder.build_xor(left, right, "bit_xor").unwrap();

// Bitwise NOT: ~1010 = ...11110101 (two's complement)
let bit_not = builder.build_not(left, "bit_not").unwrap();

// Left shift: 1010 << 2 = 101000
let shift_left = builder.build_left_shift(left, i64_type.const_int(2, false), "shl").unwrap();

// Right shift (arithmetic): 1010 >> 1 = 101
let shift_right = builder.build_right_shift(left, i64_type.const_int(1, false), true, "shr").unwrap();
}

Generated LLVM IR:

%bit_and = and i64 10, 12
%bit_or = or i64 10, 12
%bit_xor = xor i64 10, 12
%bit_not = xor i64 10, -1
%shl = shl i64 10, 2
%shr = ashr i64 10, 1  ; arithmetic right shift (preserves sign)
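The choice between arithmetic and logical right shift only shows up for negative values. The following plain-Rust sketch models both; the function names follow the LLVM instructions and are otherwise hypothetical. It assumes shift amounts below 64, since larger amounts panic in Rust debug builds and yield a poison value in LLVM.

```rust
/// Models `ashr`: the sign bit is copied into the vacated positions.
fn ashr(value: i64, amount: u32) -> i64 {
    value >> amount
}

/// Models `lshr`: vacated positions are filled with zeros.
fn lshr(value: i64, amount: u32) -> i64 {
    ((value as u64) >> amount) as i64
}
```

For positive inputs such as 10 both shifts agree, but `ashr(-16, 2)` gives -4 while `lshr(-16, 2)` gives a large positive number.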

Unary Operations

Arithmetic Unary Operations

#![allow(unused)]
fn main() {
let value = i64_type.const_int(42, false);
let float_val = f64_type.const_float(3.14);

// Unary minus (negation)
let neg_int = builder.build_int_neg(value, "neg_int").unwrap();
let neg_float = builder.build_float_neg(float_val, "neg_float").unwrap();

// Unary plus (identity - no operation needed)
let pos_int = value; // Just use the value directly
}

Generated LLVM IR:

%neg_int = sub i64 0, 42      ; Negation as subtraction from zero
%neg_float = fneg double 3.14 ; Direct float negation

Type-Specific Considerations

Integer overflow behavior: LLVM integer operations wrap on overflow by default:

#![allow(unused)]
fn main() {
// This will wrap around for large values
let max_val = i64_type.const_int(i64::MAX as u64, false);
let one = i64_type.const_int(1, false);
let overflow = builder.build_int_add(max_val, one, "overflow").unwrap();
// Result wraps to i64::MIN
}

Overflow detection (if Y Lang needs it):

#![allow(unused)]
fn main() {
use inkwell::intrinsics::Intrinsic;

// Get overflow-checking intrinsic
let intrinsic = Intrinsic::find("llvm.sadd.with.overflow.i64").unwrap();
let intrinsic_fn = intrinsic.get_declaration(&module, &[i64_type.into()]).unwrap();

// Call with overflow detection
let args = vec![left.into(), right.into()];
let result = builder.build_call(intrinsic_fn, &args, "add_overflow").unwrap();

// Extract result and overflow flag
let result_struct = result.try_as_basic_value().left().unwrap().into_struct_value();
let sum = builder.build_extract_value(result_struct, 0, "sum").unwrap();
let overflow_flag = builder.build_extract_value(result_struct, 1, "overflow").unwrap();
}
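Rust's standard library exposes the same two behaviors, which makes the semantics easy to check. In this sketch `wrapping_sum` models a plain `add`, and `sadd_with_overflow` models the `{ i64, i1 }` pair returned by the intrinsic; both names are chosen for illustration.

```rust
/// Models a plain LLVM `add`: the result wraps around on overflow.
fn wrapping_sum(a: i64, b: i64) -> i64 {
    a.wrapping_add(b)
}

/// Models `llvm.sadd.with.overflow.i64`: the wrapped sum plus an overflow flag.
fn sadd_with_overflow(a: i64, b: i64) -> (i64, bool) {
    a.overflowing_add(b)
}
```

A front end that wants checked arithmetic would branch on the flag (the second tuple element here) to an error path instead of using the wrapped sum.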

Operator Precedence Implementation

Y Lang's operator precedence needs careful handling during parsing, but at the LLVM level, operations are explicit:

#![allow(unused)]
fn main() {
// Y Lang: a + b * c
// Parser ensures this becomes: a + (b * c)

let a = i64_type.const_int(5, false);
let b = i64_type.const_int(3, false);
let c = i64_type.const_int(2, false);

// First: b * c
let mul_result = builder.build_int_mul(b, c, "mul").unwrap();

// Then: a + (result)
let final_result = builder.build_int_add(a, mul_result, "add").unwrap();
}

Generated LLVM IR:

%mul = mul i64 3, 2
%add = add i64 5, %mul
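Because precedence is already encoded in the shape of the AST, code generation reduces to a post-order walk. The sketch below assumes a hypothetical `Expr` type and emits IR-like text instead of calling a builder; the temporary names (`%t0`, `%t1`) are made up for the example.

```rust
/// Hypothetical expression tree; precedence is already encoded in its shape.
enum Expr {
    Const(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Emits IR-like text for `expr` into `out` and returns the operand naming its result.
fn emit(expr: &Expr, out: &mut Vec<String>) -> String {
    match expr {
        Expr::Const(value) => value.to_string(),
        Expr::Add(lhs, rhs) => emit_binary("add", lhs, rhs, out),
        Expr::Mul(lhs, rhs) => emit_binary("mul", lhs, rhs, out),
    }
}

/// Emits both operands first (post-order), then the operation itself.
fn emit_binary(op: &str, lhs: &Expr, rhs: &Expr, out: &mut Vec<String>) -> String {
    let left = emit(lhs, out);
    let right = emit(rhs, out);
    let name = format!("%t{}", out.len());
    out.push(format!("{} = {} i64 {}, {}", name, op, left, right));
    name
}

/// Builds the tree for `5 + 3 * 2` as a parser would and returns the emitted lines.
fn emit_example() -> Vec<String> {
    let tree = Expr::Add(
        Box::new(Expr::Const(5)),
        Box::new(Expr::Mul(Box::new(Expr::Const(3)), Box::new(Expr::Const(2)))),
    );
    let mut out = Vec::new();
    emit(&tree, &mut out);
    out
}
```

The multiplication is emitted before the addition simply because it is a child of the addition node; no precedence table is needed at this stage.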

Type Coercion and Promotion

Where Y Lang allows operands of different integer widths, the narrower operand has to be extended before the operation:

#![allow(unused)]
fn main() {
// Promoting smaller integers to i64
let i32_type = context.i32_type();
let small_val = i32_type.const_int(100, false);
let large_val = i64_type.const_int(200, false);

// Promote i32 to i64
let promoted = builder.build_int_s_extend(small_val, i64_type, "promoted").unwrap();

// Now can operate
let result = builder.build_int_add(promoted, large_val, "result").unwrap();
}

Generated LLVM IR:

%promoted = sext i32 100 to i64
%result = add i64 %promoted, 200
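Sign extension is the right choice for signed integers; zero extension would change the value of negative numbers. A small plain-Rust model, with function names borrowed from the LLVM instructions:

```rust
/// Models `sext i32 ... to i64`: the sign bit is replicated into the upper half.
fn sext(value: i32) -> i64 {
    value as i64
}

/// Models `zext i32 ... to i64`: the upper half is filled with zeros.
fn zext(value: i32) -> i64 {
    (value as u32) as i64
}
```

For non-negative inputs such as 100 both agree, but `sext(-1)` stays -1 while `zext(-1)` becomes 4294967295.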

Advanced Operation Patterns

Conditional Operations (Ternary-like)

LLVM's select instruction provides conditional value selection:

#![allow(unused)]
fn main() {
let condition = bool_type.const_int(1, false); // true
let true_val = i64_type.const_int(42, false);
let false_val = i64_type.const_int(24, false);

// condition ? true_val : false_val
let selected = builder.build_select(condition, true_val, false_val, "select").unwrap();
}

Generated LLVM IR:

%select = select i1 true, i64 42, i64 24

Pointer Arithmetic

For array indexing and memory operations:

#![allow(unused)]
fn main() {
// Array element access using GEP
let array_type = i64_type.array_type(10);
let array_ptr = builder.build_alloca(array_type, "array").unwrap();

let zero = i64_type.const_int(0, false);
let index = i64_type.const_int(5, false);

let element_ptr = unsafe {
    builder.build_gep(array_type, array_ptr, &[zero, index], "elem_ptr").unwrap()
};
}

Generated LLVM IR:

%array = alloca [10 x i64]
%elem_ptr = getelementptr [10 x i64], ptr %array, i64 0, i64 5
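The two indices of this GEP translate into a simple address computation: the first index steps over whole arrays, the second over elements. The sketch below assumes a one-dimensional array and densely packed elements; `array_gep_offset` is a hypothetical helper.

```rust
/// Byte offset computed by `getelementptr [len x T], ptr %p, i64 outer, i64 inner`,
/// where `elem_size` is the size of `T` in bytes.
fn array_gep_offset(elem_size: i64, len: i64, outer: i64, inner: i64) -> i64 {
    outer * (len * elem_size) + inner * elem_size
}
```

For the example above, `array_gep_offset(8, 10, 0, 5)` is 40, so `%elem_ptr` points 40 bytes past `%array`.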

Error Handling and Validation

Runtime Division by Zero Checks

#![allow(unused)]
fn main() {
fn safe_divide<'ctx>(
    builder: &Builder<'ctx>,
    left: IntValue<'ctx>,
    right: IntValue<'ctx>,
    context: &'ctx Context
) -> IntValue<'ctx> {
    let i64_type = context.i64_type();
    let zero = i64_type.const_zero();

    // Check if divisor is zero
    let is_zero = builder.build_int_compare(
        IntPredicate::EQ,
        right,
        zero,
        "is_zero"
    ).unwrap();

    // Use select to avoid division by zero
    let safe_divisor = builder.build_select(
        is_zero,
        i64_type.const_int(1, false), // Use 1 if zero (or handle error differently)
        right,
        "safe_divisor"
    ).unwrap();

    builder.build_int_signed_div(left, safe_divisor.into_int_value(), "safe_div").unwrap()
}
}
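Substituting 1 for a zero divisor silently returns the dividend, and it still leaves the second undefined case of `sdiv` open: i64::MIN / -1 overflows. A language that prefers to report both cases could check them explicitly. The following plain-Rust model shows the two conditions; `checked_sdiv` and its error messages are hypothetical.

```rust
/// Signed division that reports both cases in which `sdiv` is undefined.
fn checked_sdiv(left: i64, right: i64) -> Result<i64, String> {
    if right == 0 {
        return Err("division by zero".to_string());
    }
    if left == i64::MIN && right == -1 {
        // The mathematical result, 2^63, does not fit into an i64.
        return Err("division overflow".to_string());
    }
    Ok(left / right)
}
```

In generated code each condition would be an `icmp` feeding a conditional branch to an error block, rather than a `select`.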

Type Validation for Operations

#![allow(unused)]
fn main() {
fn validate_arithmetic_types<'ctx>(
    left_type: BasicTypeEnum<'ctx>,
    right_type: BasicTypeEnum<'ctx>
) -> Result<BasicTypeEnum<'ctx>, String> {
    match (left_type, right_type) {
        (BasicTypeEnum::IntType(l), BasicTypeEnum::IntType(r)) if l == r => Ok(left_type),
        (BasicTypeEnum::FloatType(l), BasicTypeEnum::FloatType(r)) if l == r => Ok(left_type),
        _ => Err(format!("Type mismatch in arithmetic: {:?} vs {:?}", left_type, right_type))
    }
}
}

Performance Optimization

Constant Folding

Operations on constants are folded at compile time, while operations on loaded values emit instructions that execute at runtime:

#![allow(unused)]
fn main() {
// This gets computed at compile time
let a = i64_type.const_int(10, false);
let b = i64_type.const_int(20, false);
let c = a.const_add(b); // Immediate result: 30

// This requires runtime computation
let runtime_a = builder.build_load(i64_type, some_ptr, "a").unwrap().into_int_value();
let runtime_result = builder.build_int_add(runtime_a, b, "result").unwrap();
}

Strength Reduction

Some operations can be optimized by LLVM:

#![allow(unused)]
fn main() {
// Multiplication by power of 2 -> shift
let val = i64_type.const_int(42, false);
let mul_by_8 = builder.build_int_mul(val, i64_type.const_int(8, false), "mul8").unwrap();
// For a non-constant %val, LLVM may optimize this to: shl i64 %val, 3
// (with the constant 42 used here, the whole expression folds to 336)
}
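The rewrite is valid because both forms produce the same bits for every input, including values that wrap. A quick plain-Rust check of that equivalence, with hypothetical helper names:

```rust
/// Multiplication by 8 with LLVM-style wrapping on overflow.
fn mul_by_8(x: i64) -> i64 {
    x.wrapping_mul(8)
}

/// The strength-reduced form: shift left by 3 bits.
fn shl_by_3(x: i64) -> i64 {
    x << 3
}
```

Both functions agree on small values like 42 as well as on i64::MAX and i64::MIN, where the multiplication wraps.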

Together, these patterns form the foundation for implementing Y Lang's expression evaluation in LLVM, covering type safety, instruction selection, and performance.

Functions

This section covers implementing Y Lang's function system using Inkwell, including function declaration, parameter handling, calls, returns, and calling conventions.

Function Declaration and Signatures

Why function types matter: LLVM requires explicit function signatures that define parameter types, return type, and calling conventions. This enables type checking, optimization, and proper code generation.

Basic Function Declaration

Y Lang functions map to LLVM functions with explicit type signatures:

#![allow(unused)]
fn main() {
use inkwell::context::Context;
use inkwell::types::BasicMetadataTypeEnum;

let context = Context::create();
let module = context.create_module("functions");
let builder = context.create_builder();

let i64_type = context.i64_type();
let f64_type = context.f64_type();
let void_type = context.void_type();

// Function: fn add(a: i64, b: i64) -> i64
let param_types = vec![
    BasicMetadataTypeEnum::IntType(i64_type),
    BasicMetadataTypeEnum::IntType(i64_type),
];
let fn_type = i64_type.fn_type(&param_types, false); // false = not variadic
let add_function = module.add_function("add", fn_type, None);
}

Generated LLVM IR:

declare i64 @add(i64, i64)

Implementation steps:

  1. Collect parameter types from Y Lang function signature
  2. Determine return type (void for no return)
  3. Create LLVM function type with fn_type()
  4. Add function to module with unique name
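The steps above can be sketched without LLVM by producing the textual declaration directly. `FnSignature` and `declare_line` are hypothetical names, and the type strings are assumed to be already lowered to LLVM type names.

```rust
/// Hypothetical, already-lowered function signature.
struct FnSignature<'a> {
    name: &'a str,
    params: Vec<&'a str>, // step 1: parameter types
    ret: Option<&'a str>, // step 2: None means no return value
}

/// Steps 3 and 4: combine the pieces into one declaration line.
fn declare_line(sig: &FnSignature) -> String {
    let ret = sig.ret.unwrap_or("void");
    format!("declare {} @{}({})", ret, sig.name, sig.params.join(", "))
}
```

For `add` with two `i64` parameters this produces `declare i64 @add(i64, i64)`, matching the IR shown above.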

Function with No Parameters

#![allow(unused)]
fn main() {
// Function: fn get_answer() -> i64
let fn_type = i64_type.fn_type(&[], false); // Empty parameter list
let get_answer = module.add_function("get_answer", fn_type, None);
}

Generated LLVM IR:

declare i64 @get_answer()

Void Functions

#![allow(unused)]
fn main() {
// Function: fn print_hello() -> ()
let void_fn_type = void_type.fn_type(&[], false);
let print_hello = module.add_function("print_hello", void_fn_type, None);
}

Generated LLVM IR:

declare void @print_hello()

Function Implementation

Why basic blocks are required: LLVM functions must contain at least one basic block with a terminator instruction. The entry block is where execution begins.

Complete Function Implementation

#![allow(unused)]
fn main() {
// Implement: fn add(a: i64, b: i64) -> i64 { a + b }

// Create entry basic block
let entry_block = context.append_basic_block(add_function, "entry");
builder.position_at_end(entry_block);

// Access function parameters
let param_a = add_function.get_nth_param(0).unwrap().into_int_value();
let param_b = add_function.get_nth_param(1).unwrap().into_int_value();

// Allocate parameters for potential mutation (Y Lang semantics)
let a_alloca = builder.build_alloca(i64_type, "a").unwrap();
let b_alloca = builder.build_alloca(i64_type, "b").unwrap();

builder.build_store(a_alloca, param_a).unwrap();
builder.build_store(b_alloca, param_b).unwrap();

// Function body: a + b
let a_val = builder.build_load(i64_type, a_alloca, "a_val").unwrap().into_int_value();
let b_val = builder.build_load(i64_type, b_alloca, "b_val").unwrap().into_int_value();
let sum = builder.build_int_add(a_val, b_val, "sum").unwrap();

// Return result
builder.build_return(Some(&sum)).unwrap();
}

Generated LLVM IR:

define i64 @add(i64 %0, i64 %1) {
entry:
  %a = alloca i64
  %b = alloca i64
  store i64 %0, ptr %a
  store i64 %1, ptr %b
  %a_val = load i64, ptr %a
  %b_val = load i64, ptr %b
  %sum = add i64 %a_val, %b_val
  ret i64 %sum
}

Void Function Implementation

#![allow(unused)]
fn main() {
// Implement: fn print_number(n: i64) { /* side effects only */ }
let param_types = vec![BasicMetadataTypeEnum::IntType(i64_type)];
let fn_type = void_type.fn_type(&param_types, false);
let print_number = module.add_function("print_number", fn_type, None);

let entry_block = context.append_basic_block(print_number, "entry");
builder.position_at_end(entry_block);

// Access parameter
let param_n = print_number.get_nth_param(0).unwrap().into_int_value();

// Function body (side effects, I/O, etc.)
// ... implementation details ...

// Void return
builder.build_return(None).unwrap();
}

Generated LLVM IR:

define void @print_number(i64 %0) {
entry:
  ; function body
  ret void
}

Parameter Handling Strategies

Immutable Parameters (Default)

Parameters that won't be reassigned can be used directly, without allocation:

#![allow(unused)]
fn main() {
// Optimized version for immutable parameters
let entry_block = context.append_basic_block(add_function, "entry");
builder.position_at_end(entry_block);

let param_a = add_function.get_nth_param(0).unwrap().into_int_value();
let param_b = add_function.get_nth_param(1).unwrap().into_int_value();

// Direct use without allocation
let sum = builder.build_int_add(param_a, param_b, "sum").unwrap();
builder.build_return(Some(&sum)).unwrap();
}

Generated LLVM IR:

define i64 @add(i64 %0, i64 %1) {
entry:
  %sum = add i64 %0, %1
  ret i64 %sum
}

Mutable Parameters

For parameters that may be reassigned within the function:

#![allow(unused)]
fn main() {
// Function: fn increment_and_add(mut a: i64, b: i64) -> i64
let entry_block = context.append_basic_block(function, "entry");
builder.position_at_end(entry_block);

let param_a = function.get_nth_param(0).unwrap().into_int_value();
let param_b = function.get_nth_param(1).unwrap().into_int_value();

// Allocate only mutable parameter
let a_alloca = builder.build_alloca(i64_type, "a").unwrap();
builder.build_store(a_alloca, param_a).unwrap();

// Increment a
let a_val = builder.build_load(i64_type, a_alloca, "a_val").unwrap().into_int_value();
let incremented = builder.build_int_add(a_val, i64_type.const_int(1, false), "incremented").unwrap();
builder.build_store(a_alloca, incremented).unwrap();

// Add to b
let final_a = builder.build_load(i64_type, a_alloca, "final_a").unwrap().into_int_value();
let result = builder.build_int_add(final_a, param_b, "result").unwrap();

builder.build_return(Some(&result)).unwrap();
}

Reference Parameters

Y Lang references are passed as pointers:

#![allow(unused)]
fn main() {
// Function: fn increment_ref(ref: &mut i64)
let ptr_type = context.ptr_type(Default::default());
let param_types = vec![BasicMetadataTypeEnum::PointerType(ptr_type)];
let fn_type = void_type.fn_type(&param_types, false);
let increment_ref = module.add_function("increment_ref", fn_type, None);

let entry_block = context.append_basic_block(increment_ref, "entry");
builder.position_at_end(entry_block);

let ref_param = increment_ref.get_nth_param(0).unwrap().into_pointer_value();

// Load current value
let current = builder.build_load(i64_type, ref_param, "current").unwrap().into_int_value();

// Increment
let incremented = builder.build_int_add(current, i64_type.const_int(1, false), "incremented").unwrap();

// Store back
builder.build_store(ref_param, incremented).unwrap();
builder.build_return(None).unwrap();
}

Generated LLVM IR:

define void @increment_ref(ptr %0) {
entry:
  %current = load i64, ptr %0
  %incremented = add i64 %current, 1
  store i64 %incremented, ptr %0
  ret void
}

Function Calls

Why call instructions matter: LLVM call instructions handle argument passing, stack management, and return value handling according to the target's calling convention.

Basic Function Calls

#![allow(unused)]
fn main() {
// Call: add(10, 20)
let arg1 = i64_type.const_int(10, false);
let arg2 = i64_type.const_int(20, false);
let args = vec![arg1.into(), arg2.into()];

let call_result = builder.build_call(add_function, &args, "call_add").unwrap();
let return_value = call_result.try_as_basic_value().left().unwrap().into_int_value();

// Use return value
let doubled = builder.build_int_mul(return_value, i64_type.const_int(2, false), "doubled").unwrap();
}

Generated LLVM IR:

%call_add = call i64 @add(i64 10, i64 20)
%doubled = mul i64 %call_add, 2

Void Function Calls

#![allow(unused)]
fn main() {
// Call: print_number(42)
let arg = i64_type.const_int(42, false);
let args = vec![arg.into()];

// A void call produces no value, so pass an empty name (LLVM does not name void results)
builder.build_call(print_number, &args, "").unwrap();
// No return value to handle
}

Generated LLVM IR:

call void @print_number(i64 42)

Nested Function Calls

#![allow(unused)]
fn main() {
// Call: add(add(1, 2), add(3, 4))
let inner1_args = vec![
    i64_type.const_int(1, false).into(),
    i64_type.const_int(2, false).into()
];
let inner1_result = builder.build_call(add_function, &inner1_args, "inner1").unwrap()
    .try_as_basic_value().left().unwrap();

let inner2_args = vec![
    i64_type.const_int(3, false).into(),
    i64_type.const_int(4, false).into()
];
let inner2_result = builder.build_call(add_function, &inner2_args, "inner2").unwrap()
    .try_as_basic_value().left().unwrap();

let outer_args = vec![inner1_result.into(), inner2_result.into()];
let final_result = builder.build_call(add_function, &outer_args, "outer").unwrap();
}

Generated LLVM IR:

%inner1 = call i64 @add(i64 1, i64 2)
%inner2 = call i64 @add(i64 3, i64 4)
%outer = call i64 @add(i64 %inner1, i64 %inner2)

Return Value Handling

Early Returns

Y Lang functions can have multiple return points:

#![allow(unused)]
fn main() {
// Function: fn abs(x: i64) -> i64
let fn_type = i64_type.fn_type(&[BasicMetadataTypeEnum::IntType(i64_type)], false);
let abs_function = module.add_function("abs", fn_type, None);

let entry_block = context.append_basic_block(abs_function, "entry");
let negative_block = context.append_basic_block(abs_function, "negative");
let positive_block = context.append_basic_block(abs_function, "positive");

builder.position_at_end(entry_block);

let param_x = abs_function.get_nth_param(0).unwrap().into_int_value();
let zero = i64_type.const_zero();

// Check if negative
let is_negative = builder.build_int_compare(
    IntPredicate::SLT,
    param_x,
    zero,
    "is_negative"
).unwrap();

builder.build_conditional_branch(is_negative, negative_block, positive_block).unwrap();

// Negative case: return -x
builder.position_at_end(negative_block);
let negated = builder.build_int_neg(param_x, "negated").unwrap();
builder.build_return(Some(&negated)).unwrap();

// Non-negative case: return x
builder.position_at_end(positive_block);
builder.build_return(Some(&param_x)).unwrap();
}

Generated LLVM IR:

define i64 @abs(i64 %0) {
entry:
  %is_negative = icmp slt i64 %0, 0
  br i1 %is_negative, label %negative, label %positive

negative:
  %negated = sub i64 0, %0
  ret i64 %negated

positive:
  ret i64 %0
}

Expression-Based Returns

Y Lang functions return the value of their last expression:

#![allow(unused)]
fn main() {
// Function body is a single expression
let entry_block = context.append_basic_block(function, "entry");
builder.position_at_end(entry_block);

// Function body expression evaluation
let result = /* ... evaluate expression ... */;

// Return the expression result
builder.build_return(Some(&result)).unwrap();
}

Function Overloading and Name Mangling

Y Lang may support function overloading, requiring name mangling:

#![allow(unused)]
fn main() {
// Original: fn add(a: i64, b: i64) -> i64
// Mangled: add_i64_i64_i64

fn mangle_function_name(name: &str, param_types: &[Type], return_type: &Type) -> String {
    let mut mangled = name.to_string();

    for param_type in param_types {
        mangled.push('_');
        mangled.push_str(&type_to_string(param_type));
    }

    mangled.push('_');
    mangled.push_str(&type_to_string(return_type));

    mangled
}

// Usage
let mangled_name = mangle_function_name("add", &[Type::I64, Type::I64], &Type::I64);
let function = module.add_function(&mangled_name, fn_type, None);
}
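The snippet above relies on a Type enum and a type_to_string helper that are not shown. The following is a minimal, self-contained sketch of the same scheme in plain Rust. The Type enum and its variants are assumptions made for illustration, not Y Lang's actual type representation.

```rust
// Hypothetical stand-in for the compiler's type representation.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    I64,
    Bool,
    Void,
}

// Hypothetical helper: the textual fragment used for each type.
fn type_to_string(ty: &Type) -> String {
    match ty {
        Type::I64 => "i64".to_string(),
        Type::Bool => "bool".to_string(),
        Type::Void => "void".to_string(),
    }
}

// Same scheme as above: name, then each parameter type, then the return type,
// all joined with underscores.
fn mangle_function_name(name: &str, param_types: &[Type], return_type: &Type) -> String {
    let mut mangled = name.to_string();

    for param_type in param_types {
        mangled.push('_');
        mangled.push_str(&type_to_string(param_type));
    }

    mangled.push('_');
    mangled.push_str(&type_to_string(return_type));

    mangled
}
```

Calling mangle_function_name("add", &[Type::I64, Type::I64], &Type::I64) yields add_i64_i64_i64. One caveat of a plain underscore separator: a function named add_i64 with no parameters and a function named add with one i64 parameter both mangle to add_i64_i64 when they return i64. A real scheme would likely need a separator that cannot appear in identifiers, or length prefixes.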

Recursive Functions

Recursive functions need no special handling: module.add_function registers the function before its body is built, so the body can call the function's own FunctionValue:

#![allow(unused)]
fn main() {
// Function: fn factorial(n: i64) -> i64
let fn_type = i64_type.fn_type(&[BasicMetadataTypeEnum::IntType(i64_type)], false);
let factorial = module.add_function("factorial", fn_type, None);

let entry_block = context.append_basic_block(factorial, "entry");
let base_case_block = context.append_basic_block(factorial, "base_case");
let recursive_case_block = context.append_basic_block(factorial, "recursive_case");

builder.position_at_end(entry_block);

let param_n = factorial.get_nth_param(0).unwrap().into_int_value();
let one = i64_type.const_int(1, false);

// Check base case: n <= 1
let is_base_case = builder.build_int_compare(
    IntPredicate::SLE,
    param_n,
    one,
    "is_base_case"
).unwrap();

builder.build_conditional_branch(is_base_case, base_case_block, recursive_case_block).unwrap();

// Base case: return 1
builder.position_at_end(base_case_block);
builder.build_return(Some(&one)).unwrap();

// Recursive case: return n * factorial(n - 1)
builder.position_at_end(recursive_case_block);
let n_minus_1 = builder.build_int_sub(param_n, one, "n_minus_1").unwrap();

// Recursive call
let recursive_args = vec![n_minus_1.into()];
let recursive_result = builder.build_call(factorial, &recursive_args, "factorial_recursive").unwrap()
    .try_as_basic_value().left().unwrap().into_int_value();

let result = builder.build_int_mul(param_n, recursive_result, "result").unwrap();
builder.build_return(Some(&result)).unwrap();
}

Generated LLVM IR:

define i64 @factorial(i64 %0) {
entry:
  %is_base_case = icmp sle i64 %0, 1
  br i1 %is_base_case, label %base_case, label %recursive_case

base_case:
  ret i64 1

recursive_case:
  %n_minus_1 = sub i64 %0, 1
  %factorial_recursive = call i64 @factorial(i64 %n_minus_1)
  %result = mul i64 %0, %factorial_recursive
  ret i64 %result
}

Higher-Order Functions

Functions that take other functions as parameters receive them as function pointers and invoke them with an indirect call:

#![allow(unused)]
fn main() {
// Function type for operation: fn(i64, i64) -> i64
let op_fn_type = i64_type.fn_type(&[
    BasicMetadataTypeEnum::IntType(i64_type),
    BasicMetadataTypeEnum::IntType(i64_type)
], false);

// Function: fn apply_op(op: fn(i64, i64) -> i64, a: i64, b: i64) -> i64
// With opaque pointers every pointer, including a function pointer, has type `ptr`
let fn_ptr_type = context.ptr_type(Default::default());
let apply_op_type = i64_type.fn_type(&[
    BasicMetadataTypeEnum::PointerType(fn_ptr_type),
    BasicMetadataTypeEnum::IntType(i64_type),
    BasicMetadataTypeEnum::IntType(i64_type)
], false);

let apply_op = module.add_function("apply_op", apply_op_type, None);

let entry_block = context.append_basic_block(apply_op, "entry");
builder.position_at_end(entry_block);

let fn_param = apply_op.get_nth_param(0).unwrap().into_pointer_value();
let a_param = apply_op.get_nth_param(1).unwrap().into_int_value();
let b_param = apply_op.get_nth_param(2).unwrap().into_int_value();

// Call the function pointer
let args = vec![a_param.into(), b_param.into()];
let result = builder.build_indirect_call(op_fn_type, fn_param, &args, "indirect_call").unwrap()
    .try_as_basic_value().left().unwrap();

builder.build_return(Some(&result)).unwrap();
}

Generated LLVM IR:

define i64 @apply_op(ptr %0, i64 %1, i64 %2) {
entry:
  %indirect_call = call i64 %0(i64 %1, i64 %2)
  ret i64 %indirect_call
}

Error Handling in Functions

Validation and Assertions

#![allow(unused)]
fn main() {
fn validate_function_signature(
    name: &str,
    param_types: &[BasicTypeEnum],
    return_type: Option<BasicTypeEnum>
) -> Result<(), String> {
    if name.is_empty() {
        return Err("Function name cannot be empty".to_string());
    }

    if param_types.len() > 255 {
        return Err("Too many parameters (max 255)".to_string());
    }

    // Additional validation logic
    Ok(())
}
}

Safe Function Calls

#![allow(unused)]
fn main() {
fn safe_function_call<'ctx>(
    builder: &Builder<'ctx>,
    function: FunctionValue<'ctx>,
    args: &[BasicValueEnum<'ctx>],
    name: &str
) -> Result<Option<BasicValueEnum<'ctx>>, String> {
    let fn_type = function.get_type();
    let param_types = fn_type.get_param_types();

    if args.len() != param_types.len() {
        return Err(format!(
            "Argument count mismatch: expected {}, got {}",
            param_types.len(),
            args.len()
        ));
    }

    // Type checking
    for (i, (arg, expected_type)) in args.iter().zip(param_types.iter()).enumerate() {
        if arg.get_type() != *expected_type {
            return Err(format!(
                "Argument {} type mismatch: expected {:?}, got {:?}",
                i, expected_type, arg.get_type()
            ));
        }
    }

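    // build_call expects BasicMetadataValueEnum arguments, so convert each one first
    let call_args: Vec<BasicMetadataValueEnum<'ctx>> =
        args.iter().map(|arg| (*arg).into()).collect();

    let call_site = builder.build_call(function, &call_args, name)
        .map_err(|e| format!("Call failed: {}", e))?;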

    Ok(call_site.try_as_basic_value().left())
}
}

Optimization Considerations

Inlining Hints

#![allow(unused)]
fn main() {
use inkwell::attributes::{Attribute, AttributeLoc};

// Mark function for inlining
let inline_attr = context.create_enum_attribute(Attribute::get_named_enum_kind_id("alwaysinline"), 0);
function.add_attribute(AttributeLoc::Function, inline_attr);
}

Tail Call Optimization

#![allow(unused)]
fn main() {
// Mark the call as a tail call (a hint that only applies when the call is in tail position)
let call_site = builder.build_call(function, args, "tail_call").unwrap();
call_site.set_tail_call(true);
}
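The tail marker only helps when the call is the last thing the function does. The factorial shown under Recursive Functions multiplies after the recursive call returns, so that call is not in tail position. As a rough sketch in plain Rust (not Inkwell code; factorial_acc and its accumulator parameter are illustrative names), the usual rewrite threads the partial result through an extra argument:

```rust
// Not a tail call: the multiplication happens after the recursive call returns.
fn factorial(n: i64) -> i64 {
    if n <= 1 {
        1
    } else {
        n * factorial(n - 1)
    }
}

// Tail-recursive form: the recursive call is the final action, and its result
// is returned unchanged. `acc` carries the partial product.
fn factorial_acc(n: i64, acc: i64) -> i64 {
    if n <= 1 {
        acc
    } else {
        factorial_acc(n - 1, acc * n)
    }
}
```

Both compute the same value when factorial_acc starts with an accumulator of 1. Only the second shape gives the generated call a chance of being turned into a jump.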

Function Attributes

#![allow(unused)]
fn main() {
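// Mark function as read-only: it may read memory but never writes to it
// (on LLVM 16 and later, this function-level property is spelled `memory(read)`)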
let readonly_attr = context.create_enum_attribute(Attribute::get_named_enum_kind_id("readonly"), 0);
function.add_attribute(AttributeLoc::Function, readonly_attr);

// Mark function as not throwing exceptions
let nounwind_attr = context.create_enum_attribute(Attribute::get_named_enum_kind_id("nounwind"), 0);
function.add_attribute(AttributeLoc::Function, nounwind_attr);
}

Together, these patterns cover the core of Y Lang's function system in LLVM: declaration, implementation, calls, and more advanced cases such as recursion and higher-order functions.

Control Flow

This section covers implementing Y Lang's control flow constructs with Inkwell: conditional expressions, loops, blocks, and more advanced patterns, all built on LLVM's basic block system.

Conditional Expressions (If-Else)

Why basic blocks for conditionals: LLVM represents control flow as graphs of basic blocks connected by branches. Each path through an if-else requires separate basic blocks to maintain proper SSA form.

Simple If Expression

Y Lang's if expressions evaluate to values, requiring careful handling of result values:

#![allow(unused)]
fn main() {
use inkwell::context::Context;
use inkwell::IntPredicate;

let context = Context::create();
let module = context.create_module("conditionals");
let builder = context.create_builder();

let i64_type = context.i64_type();
let bool_type = context.bool_type();

// Function context
let fn_type = i64_type.fn_type(&[], false);
let function = module.add_function("test_if", fn_type, None);

// Create basic blocks
let entry_block = context.append_basic_block(function, "entry");
let then_block = context.append_basic_block(function, "then");
let else_block = context.append_basic_block(function, "else");
let merge_block = context.append_basic_block(function, "merge");

builder.position_at_end(entry_block);

// Evaluate condition: x > 10
let x = i64_type.const_int(15, false);
let ten = i64_type.const_int(10, false);
let condition = builder.build_int_compare(
    IntPredicate::SGT,
    x,
    ten,
    "x_gt_10"
).unwrap();

// Branch based on condition
builder.build_conditional_branch(condition, then_block, else_block).unwrap();

// Then branch: if x > 10 { 42 }
builder.position_at_end(then_block);
let then_value = i64_type.const_int(42, false);
builder.build_unconditional_branch(merge_block).unwrap();

// Else branch: else { 0 }
builder.position_at_end(else_block);
let else_value = i64_type.const_int(0, false);
builder.build_unconditional_branch(merge_block).unwrap();

// Merge block: combine results with PHI
builder.position_at_end(merge_block);
let phi = builder.build_phi(i64_type, "if_result").unwrap();
phi.add_incoming(&[
    (&then_value, then_block),
    (&else_value, else_block)
]);

builder.build_return(Some(&phi.as_basic_value())).unwrap();
}

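Generated LLVM IR (shown unfolded for readability; because both operands are constants here, the builder folds the comparison and the emitted branch reads br i1 true):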

define i64 @test_if() {
entry:
  %x_gt_10 = icmp sgt i64 15, 10
  br i1 %x_gt_10, label %then, label %else

then:
  br label %merge

else:
  br label %merge

merge:
  %if_result = phi i64 [ 42, %then ], [ 0, %else ]
  ret i64 %if_result
}

Implementation steps:

  1. Create basic blocks for each control path (then, else, merge)
  2. Evaluate condition in entry block
  3. Use conditional branch to select path
  4. Compute branch-specific values
  5. Merge results using PHI node
  6. Continue with merged value
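
The six steps above can also be traced without LLVM. The following is a minimal, string-based sketch in plain Rust that writes out the same diamond-shaped control flow as text. It is not how Inkwell builds IR, and emit_if_else is a hypothetical helper. The condition compares two function parameters so that nothing can be folded away.

```rust
// Emits textual IR for `if x > y { then_val } else { else_val }`.
fn emit_if_else(fn_name: &str, then_val: i64, else_val: i64) -> String {
    let mut ir = String::new();

    // Steps 1-3: the entry block evaluates the condition and branches on it.
    ir.push_str(&format!("define i64 @{}(i64 %x, i64 %y) {{\n", fn_name));
    ir.push_str("entry:\n");
    ir.push_str("  %cond = icmp sgt i64 %x, %y\n");
    ir.push_str("  br i1 %cond, label %then, label %else\n\n");

    // Step 4: each branch produces its value, then jumps to the merge block.
    ir.push_str("then:\n  br label %merge\n\n");
    ir.push_str("else:\n  br label %merge\n\n");

    // Steps 5-6: the PHI selects the value that matches the predecessor block,
    // and execution continues with the merged value.
    ir.push_str("merge:\n");
    ir.push_str(&format!(
        "  %if_result = phi i64 [ {}, %then ], [ {}, %else ]\n",
        then_val, else_val
    ));
    ir.push_str("  ret i64 %if_result\n}\n");

    ir
}
```

Each incoming pair in the PHI names the block control came from, which is why the then and else blocks must exist even when they contain nothing but a branch.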

Nested If Expressions

Y Lang supports nested conditionals, requiring additional basic blocks:

#![allow(unused)]
fn main() {
// Y Lang: if x > 0 { if y > 0 { 1 } else { 2 } } else { 3 }

let outer_then_block = context.append_basic_block(function, "outer_then");
let inner_then_block = context.append_basic_block(function, "inner_then");
let inner_else_block = context.append_basic_block(function, "inner_else");
let inner_merge_block = context.append_basic_block(function, "inner_merge");
let outer_else_block = context.append_basic_block(function, "outer_else");
let final_merge_block = context.append_basic_block(function, "final_merge");

// Outer condition
builder.position_at_end(entry_block);
let x = i64_type.const_int(5, false);
let zero = i64_type.const_zero();
let x_gt_0 = builder.build_int_compare(IntPredicate::SGT, x, zero, "x_gt_0").unwrap();
builder.build_conditional_branch(x_gt_0, outer_then_block, outer_else_block).unwrap();

// Outer then: inner conditional
builder.position_at_end(outer_then_block);
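// const_int takes a u64, so a negative constant is passed as its bit pattern with sign_extend = true
let y = i64_type.const_int((-3i64) as u64, true);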
let y_gt_0 = builder.build_int_compare(IntPredicate::SGT, y, zero, "y_gt_0").unwrap();
builder.build_conditional_branch(y_gt_0, inner_then_block, inner_else_block).unwrap();

// Inner then
builder.position_at_end(inner_then_block);
let inner_then_val = i64_type.const_int(1, false);
builder.build_unconditional_branch(inner_merge_block).unwrap();

// Inner else
builder.position_at_end(inner_else_block);
let inner_else_val = i64_type.const_int(2, false);
builder.build_unconditional_branch(inner_merge_block).unwrap();

// Inner merge
builder.position_at_end(inner_merge_block);
let inner_phi = builder.build_phi(i64_type, "inner_result").unwrap();
inner_phi.add_incoming(&[
    (&inner_then_val, inner_then_block),
    (&inner_else_val, inner_else_block)
]);
builder.build_unconditional_branch(final_merge_block).unwrap();

// Outer else
builder.position_at_end(outer_else_block);
let outer_else_val = i64_type.const_int(3, false);
builder.build_unconditional_branch(final_merge_block).unwrap();

// Final merge
builder.position_at_end(final_merge_block);
let final_phi = builder.build_phi(i64_type, "final_result").unwrap();
final_phi.add_incoming(&[
    (&inner_phi.as_basic_value(), inner_merge_block),
    (&outer_else_val, outer_else_block)
]);

// Return the merged value
builder.build_return(Some(&final_phi.as_basic_value())).unwrap();
}

Generated LLVM IR (comparisons shown unfolded; with constant operands the builder folds each one to an i1 constant):

define i64 @nested_if() {
entry:
  %x_gt_0 = icmp sgt i64 5, 0
  br i1 %x_gt_0, label %outer_then, label %outer_else

outer_then:
  %y_gt_0 = icmp sgt i64 -3, 0
  br i1 %y_gt_0, label %inner_then, label %inner_else

inner_then:
  br label %inner_merge

inner_else:
  br label %inner_merge

inner_merge:
  %inner_result = phi i64 [ 1, %inner_then ], [ 2, %inner_else ]
  br label %final_merge

outer_else:
  br label %final_merge

final_merge:
  %final_result = phi i64 [ %inner_result, %inner_merge ], [ 3, %outer_else ]
  ret i64 %final_result
}

While Loops

Why loops need PHI nodes: Loop variables change over iterations, requiring PHI nodes to merge values from different loop iterations while maintaining SSA form.

Basic While Loop

#![allow(unused)]
fn main() {
// Y Lang: while i < 10 { i = i + 1; }

let loop_header = context.append_basic_block(function, "loop_header");
let loop_body = context.append_basic_block(function, "loop_body");
let loop_exit = context.append_basic_block(function, "loop_exit");

// Initialize loop variable
builder.position_at_end(entry_block);
let initial_i = i64_type.const_int(0, false);
builder.build_unconditional_branch(loop_header).unwrap();

// Loop header: check condition
builder.position_at_end(loop_header);
let i_phi = builder.build_phi(i64_type, "i").unwrap();
i_phi.add_incoming(&[(&initial_i, entry_block)]);

let ten = i64_type.const_int(10, false);
let condition = builder.build_int_compare(
    IntPredicate::SLT,
    i_phi.as_basic_value().into_int_value(),
    ten,
    "i_lt_10"
).unwrap();

builder.build_conditional_branch(condition, loop_body, loop_exit).unwrap();

// Loop body: increment i
builder.position_at_end(loop_body);
let current_i = i_phi.as_basic_value().into_int_value();
let one = i64_type.const_int(1, false);
let next_i = builder.build_int_add(current_i, one, "next_i").unwrap();

// Add back-edge to PHI
i_phi.add_incoming(&[(&next_i, loop_body)]);
builder.build_unconditional_branch(loop_header).unwrap();

// Loop exit
builder.position_at_end(loop_exit);
builder.build_return(None).unwrap();
}

Generated LLVM IR:

define void @while_loop() {
entry:
  br label %loop_header

loop_header:
  %i = phi i64 [ 0, %entry ], [ %next_i, %loop_body ]
  %i_lt_10 = icmp slt i64 %i, 10
  br i1 %i_lt_10, label %loop_body, label %loop_exit

loop_body:
  %next_i = add i64 %i, 1
  br label %loop_header

loop_exit:
  ret void
}
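To see what the PHI node in loop_header does at run time, the control flow graph above can be interpreted by hand. The sketch below is plain Rust, not Inkwell. The Block enum and run_while_loop are illustrative names. It walks the four blocks and picks the PHI's value based on the block that control arrived from.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Block {
    Entry,
    LoopHeader,
    LoopBody,
    LoopExit,
}

// Interprets the CFG of the while loop above.
// Returns the final value of `i` and how many times the body ran.
fn run_while_loop() -> (i64, u32) {
    let mut current = Block::Entry;
    let mut predecessor = Block::Entry;
    let mut i = 0i64; // value of the PHI node %i
    let mut next_i = 0i64; // value of %next_i
    let mut body_runs = 0u32;

    loop {
        match current {
            Block::Entry => {
                current = Block::LoopHeader;
            }
            Block::LoopHeader => {
                // %i = phi i64 [ 0, %entry ], [ %next_i, %loop_body ]
                i = match predecessor {
                    Block::Entry => 0,
                    Block::LoopBody => next_i,
                    _ => unreachable!("loop_header has only two predecessors"),
                };
                predecessor = Block::LoopHeader;
                current = if i < 10 { Block::LoopBody } else { Block::LoopExit };
            }
            Block::LoopBody => {
                next_i = i + 1;
                body_runs += 1;
                predecessor = Block::LoopBody;
                current = Block::LoopHeader;
            }
            Block::LoopExit => return (i, body_runs),
        }
    }
}
```

The first time the header runs, the predecessor is the entry block and the PHI yields 0. On every later visit the predecessor is the loop body and the PHI yields the incremented value, which is exactly the back-edge added with add_incoming.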

While Loop with Complex Body

#![allow(unused)]
fn main() {
// Y Lang: while x > 1 { if x % 2 == 0 { x = x / 2 } else { x = x * 3 + 1 } }
// (the bound is 1 because this iteration never reaches 0 from a positive start,
// so an `x > 0` condition would never become false)

let loop_header = context.append_basic_block(function, "loop_header");
let loop_body = context.append_basic_block(function, "loop_body");
let even_branch = context.append_basic_block(function, "even");
let odd_branch = context.append_basic_block(function, "odd");
let body_merge = context.append_basic_block(function, "body_merge");
let loop_exit = context.append_basic_block(function, "loop_exit");

// Initialize
builder.position_at_end(entry_block);
let initial_x = i64_type.const_int(7, false);
builder.build_unconditional_branch(loop_header).unwrap();

// Loop header
builder.position_at_end(loop_header);
let x_phi = builder.build_phi(i64_type, "x").unwrap();
x_phi.add_incoming(&[(&initial_x, entry_block)]);

let zero = i64_type.const_zero();
let loop_bound = i64_type.const_int(1, false);
let x_gt_1 = builder.build_int_compare(
    IntPredicate::SGT,
    x_phi.as_basic_value().into_int_value(),
    loop_bound,
    "x_gt_1"
).unwrap();
builder.build_conditional_branch(x_gt_1, loop_body, loop_exit).unwrap();

// Loop body: check if even
builder.position_at_end(loop_body);
let current_x = x_phi.as_basic_value().into_int_value();
let two = i64_type.const_int(2, false);
let remainder = builder.build_int_signed_rem(current_x, two, "remainder").unwrap();
let is_even = builder.build_int_compare(
    IntPredicate::EQ,
    remainder,
    zero,
    "is_even"
).unwrap();
builder.build_conditional_branch(is_even, even_branch, odd_branch).unwrap();

// Even branch: x = x / 2
builder.position_at_end(even_branch);
let x_div_2 = builder.build_int_signed_div(current_x, two, "x_div_2").unwrap();
builder.build_unconditional_branch(body_merge).unwrap();

// Odd branch: x = x * 3 + 1
builder.position_at_end(odd_branch);
let three = i64_type.const_int(3, false);
let one = i64_type.const_int(1, false);
let x_mul_3 = builder.build_int_mul(current_x, three, "x_mul_3").unwrap();
let x_mul_3_plus_1 = builder.build_int_add(x_mul_3, one, "x_mul_3_plus_1").unwrap();
builder.build_unconditional_branch(body_merge).unwrap();

// Merge body results
builder.position_at_end(body_merge);
let new_x_phi = builder.build_phi(i64_type, "new_x").unwrap();
new_x_phi.add_incoming(&[
    (&x_div_2, even_branch),
    (&x_mul_3_plus_1, odd_branch)
]);

// Add back-edge
x_phi.add_incoming(&[(&new_x_phi.as_basic_value(), body_merge)]);
builder.build_unconditional_branch(loop_header).unwrap();

// Exit
builder.position_at_end(loop_exit);
builder.build_return(None).unwrap();
}

Blocks and Scoping

Why blocks matter: Y Lang blocks create lexical scopes and can return values. LLVM itself has no notion of lexical scope, so the code generator models scoping through its basic block layout and by tracking which stack slot each name refers to.

Simple Block Expression

#![allow(unused)]
fn main() {
// Y Lang: { let x = 10; let y = 20; x + y }

let block_entry = context.append_basic_block(function, "block_entry");
let block_exit = context.append_basic_block(function, "block_exit");

builder.position_at_end(entry_block);
builder.build_unconditional_branch(block_entry).unwrap();

// Block body
builder.position_at_end(block_entry);

// let x = 10;
let x_alloca = builder.build_alloca(i64_type, "x").unwrap();
let ten = i64_type.const_int(10, false);
builder.build_store(x_alloca, ten).unwrap();

// let y = 20;
let y_alloca = builder.build_alloca(i64_type, "y").unwrap();
let twenty = i64_type.const_int(20, false);
builder.build_store(y_alloca, twenty).unwrap();

// x + y (block result)
let x_val = builder.build_load(i64_type, x_alloca, "x_val").unwrap();
let y_val = builder.build_load(i64_type, y_alloca, "y_val").unwrap();
let block_result = builder.build_int_add(
    x_val.into_int_value(),
    y_val.into_int_value(),
    "block_result"
).unwrap();

builder.build_unconditional_branch(block_exit).unwrap();

// Block exit: return result
builder.position_at_end(block_exit);
builder.build_return(Some(&block_result)).unwrap();
}

Generated LLVM IR:

define i64 @block_expression() {
entry:
  br label %block_entry

block_entry:
  %x = alloca i64
  store i64 10, ptr %x
  %y = alloca i64
  store i64 20, ptr %y
  %x_val = load i64, ptr %x
  %y_val = load i64, ptr %y
  %block_result = add i64 %x_val, %y_val
  br label %block_exit

block_exit:
  ret i64 %block_result
}

Nested Blocks with Shadowing

#![allow(unused)]
fn main() {
// Y Lang: { let x = 1; { let x = 2; x } }

let outer_block = context.append_basic_block(function, "outer_block");
let inner_block = context.append_basic_block(function, "inner_block");
let inner_exit = context.append_basic_block(function, "inner_exit");
let outer_exit = context.append_basic_block(function, "outer_exit");

builder.position_at_end(entry_block);
builder.build_unconditional_branch(outer_block).unwrap();

// Outer block
builder.position_at_end(outer_block);
let outer_x_alloca = builder.build_alloca(i64_type, "outer_x").unwrap();
let one = i64_type.const_int(1, false);
builder.build_store(outer_x_alloca, one).unwrap();

builder.build_unconditional_branch(inner_block).unwrap();

// Inner block (shadows outer x)
builder.position_at_end(inner_block);
let inner_x_alloca = builder.build_alloca(i64_type, "inner_x").unwrap();
let two = i64_type.const_int(2, false);
builder.build_store(inner_x_alloca, two).unwrap();

// Inner block result: inner x
let inner_x_val = builder.build_load(i64_type, inner_x_alloca, "inner_x_val").unwrap();
builder.build_unconditional_branch(inner_exit).unwrap();

// Inner exit
builder.position_at_end(inner_exit);
builder.build_unconditional_branch(outer_exit).unwrap();

// Outer exit: return inner block result
builder.position_at_end(outer_exit);
builder.build_return(Some(&inner_x_val)).unwrap();
}

Generated LLVM IR:

define i64 @nested_blocks() {
entry:
  br label %outer_block

outer_block:
  %outer_x = alloca i64
  store i64 1, ptr %outer_x
  br label %inner_block

inner_block:
  %inner_x = alloca i64
  store i64 2, ptr %inner_x
  %inner_x_val = load i64, ptr %inner_x
  br label %inner_exit

inner_exit:
  br label %outer_exit

outer_exit:
  ret i64 %inner_x_val
}
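In the example above the two allocas get distinct names (outer_x and inner_x) even though both variables are called x in the source. One common way for a code generator to keep track of this is a stack of symbol tables. The sketch below is plain Rust under that assumption. ScopeStack and its methods are hypothetical names, and stack slots are represented by their names rather than by Inkwell pointer values.

```rust
use std::collections::HashMap;

// Hypothetical symbol table: one map per lexical scope, innermost last.
// Values are the names of the stack slots chosen by the code generator.
struct ScopeStack {
    scopes: Vec<HashMap<String, String>>,
}

impl ScopeStack {
    fn new() -> Self {
        ScopeStack { scopes: vec![HashMap::new()] }
    }

    // Called when code generation enters a `{ ... }` block.
    fn enter_scope(&mut self) {
        self.scopes.push(HashMap::new());
    }

    // Called when the block ends; the outermost scope is never removed.
    fn exit_scope(&mut self) {
        if self.scopes.len() > 1 {
            self.scopes.pop();
        }
    }

    // Declares `name` in the innermost scope, shadowing any outer binding.
    fn declare(&mut self, name: &str, slot: &str) {
        if let Some(scope) = self.scopes.last_mut() {
            scope.insert(name.to_string(), slot.to_string());
        }
    }

    // Resolves `name` by searching from the innermost scope outwards.
    fn lookup(&self, name: &str) -> Option<&str> {
        self.scopes
            .iter()
            .rev()
            .find_map(|scope| scope.get(name).map(|slot| slot.as_str()))
    }
}
```

With this structure, looking up x inside the inner block resolves to inner_x, and after exit_scope the same lookup resolves to outer_x again, without touching the generated IR.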

Advanced Control Flow Patterns

Early Return from Blocks

Y Lang allows early returns from nested contexts:

#![allow(unused)]
fn main() {
// Y Lang: { if condition { return 42; } other_computation() }

let block_start = context.append_basic_block(function, "block_start");
let check_condition = context.append_basic_block(function, "check_condition");
let early_return = context.append_basic_block(function, "early_return");
let continue_block = context.append_basic_block(function, "continue_block");
let block_end = context.append_basic_block(function, "block_end");

builder.position_at_end(entry_block);
builder.build_unconditional_branch(block_start).unwrap();

builder.position_at_end(block_start);
builder.build_unconditional_branch(check_condition).unwrap();

// Check condition for early return
builder.position_at_end(check_condition);
let condition = bool_type.const_int(1, false); // true for example
builder.build_conditional_branch(condition, early_return, continue_block).unwrap();

// Early return path
builder.position_at_end(early_return);
let early_value = i64_type.const_int(42, false);
builder.build_return(Some(&early_value)).unwrap();

// Continue with normal computation
builder.position_at_end(continue_block);
let other_result = i64_type.const_int(100, false);
builder.build_unconditional_branch(block_end).unwrap();

// Block end
builder.position_at_end(block_end);
builder.build_return(Some(&other_result)).unwrap();
}

Break and Continue (for future loop constructs)

Pattern for implementing break/continue in loops:

#![allow(unused)]
fn main() {
// Y Lang: while condition { if should_break { break; } if should_continue { continue; } body; }

let loop_header = context.append_basic_block(function, "loop_header");
let loop_body = context.append_basic_block(function, "loop_body");
let check_break = context.append_basic_block(function, "check_break");
let check_continue = context.append_basic_block(function, "check_continue");
let loop_body_end = context.append_basic_block(function, "loop_body_end");
let loop_exit = context.append_basic_block(function, "loop_exit");

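// Enter the loop from the function's entry block
builder.position_at_end(entry_block);
builder.build_unconditional_branch(loop_header).unwrap();

// Loop header with condition check
builder.position_at_end(loop_header);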
let condition = bool_type.const_int(1, false); // Placeholder condition
builder.build_conditional_branch(condition, loop_body, loop_exit).unwrap();

// Loop body start
builder.position_at_end(loop_body);
builder.build_unconditional_branch(check_break).unwrap();

// Check for break
builder.position_at_end(check_break);
let should_break = bool_type.const_int(0, false); // false for example
builder.build_conditional_branch(should_break, loop_exit, check_continue).unwrap();

// Check for continue
builder.position_at_end(check_continue);
let should_continue = bool_type.const_int(0, false); // false for example
builder.build_conditional_branch(should_continue, loop_header, loop_body_end).unwrap();

// Rest of loop body
builder.position_at_end(loop_body_end);
// ... other loop body operations ...
builder.build_unconditional_branch(loop_header).unwrap();

// Loop exit
builder.position_at_end(loop_exit);
builder.build_return(None).unwrap();
}
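In the pattern above, break always jumps to loop_exit and continue always jumps to loop_header. With nested loops the code generator has to know which loop is the innermost one. A possible bookkeeping structure, sketched in plain Rust with hypothetical names (LoopTargets, LoopStack) and with block labels standing in for Inkwell basic blocks:

```rust
// Where `break` and `continue` should jump for one enclosing loop.
#[derive(Debug, Clone, PartialEq)]
struct LoopTargets {
    continue_target: String, // the loop header
    break_target: String,    // the loop exit
}

// Stack of enclosing loops, innermost last.
#[derive(Default)]
struct LoopStack {
    loops: Vec<LoopTargets>,
}

impl LoopStack {
    // Called before generating a loop body.
    fn enter_loop(&mut self, header: &str, exit: &str) {
        self.loops.push(LoopTargets {
            continue_target: header.to_string(),
            break_target: exit.to_string(),
        });
    }

    // Called after the loop body has been generated.
    fn exit_loop(&mut self) {
        self.loops.pop();
    }

    // Target block for `break`, or an error when used outside any loop.
    fn break_target(&self) -> Result<&str, String> {
        self.loops
            .last()
            .map(|targets| targets.break_target.as_str())
            .ok_or_else(|| "`break` outside of a loop".to_string())
    }

    // Target block for `continue`, or an error when used outside any loop.
    fn continue_target(&self) -> Result<&str, String> {
        self.loops
            .last()
            .map(|targets| targets.continue_target.as_str())
            .ok_or_else(|| "`continue` outside of a loop".to_string())
    }
}
```

When the code generator meets a break, it would emit an unconditional branch to break_target and treat the rest of the current block as unreachable.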

Control Flow with Variables

Loop Variables and Mutation

#![allow(unused)]
fn main() {
// Y Lang: let mut sum = 0; let mut i = 1; while i <= 10 { sum = sum + i; i = i + 1; } sum

let loop_header = context.append_basic_block(function, "loop_header");
let loop_body = context.append_basic_block(function, "loop_body");
let loop_exit = context.append_basic_block(function, "loop_exit");

// Initialize variables
builder.position_at_end(entry_block);
let sum_alloca = builder.build_alloca(i64_type, "sum").unwrap();
let i_alloca = builder.build_alloca(i64_type, "i").unwrap();

let zero = i64_type.const_zero();
let one = i64_type.const_int(1, false);
builder.build_store(sum_alloca, zero).unwrap();
builder.build_store(i_alloca, one).unwrap();

builder.build_unconditional_branch(loop_header).unwrap();

// Loop condition: i <= 10
builder.position_at_end(loop_header);
let current_i = builder.build_load(i64_type, i_alloca, "current_i").unwrap();
let ten = i64_type.const_int(10, false);
let i_le_10 = builder.build_int_compare(
    IntPredicate::SLE,
    current_i.into_int_value(),
    ten,
    "i_le_10"
).unwrap();
builder.build_conditional_branch(i_le_10, loop_body, loop_exit).unwrap();

// Loop body: sum = sum + i; i = i + 1;
builder.position_at_end(loop_body);
let current_sum = builder.build_load(i64_type, sum_alloca, "current_sum").unwrap();
let current_i_body = builder.build_load(i64_type, i_alloca, "current_i_body").unwrap();

// sum = sum + i
let new_sum = builder.build_int_add(
    current_sum.into_int_value(),
    current_i_body.into_int_value(),
    "new_sum"
).unwrap();
builder.build_store(sum_alloca, new_sum).unwrap();

// i = i + 1
let new_i = builder.build_int_add(
    current_i_body.into_int_value(),
    one,
    "new_i"
).unwrap();
builder.build_store(i_alloca, new_i).unwrap();

builder.build_unconditional_branch(loop_header).unwrap();

// Loop exit: return sum
builder.position_at_end(loop_exit);
let final_sum = builder.build_load(i64_type, sum_alloca, "final_sum").unwrap();
builder.build_return(Some(&final_sum)).unwrap();
}

Generated LLVM IR:

define i64 @sum_loop() {
entry:
  %sum = alloca i64
  %i = alloca i64
  store i64 0, ptr %sum
  store i64 1, ptr %i
  br label %loop_header

loop_header:
  %current_i = load i64, ptr %i
  %i_le_10 = icmp sle i64 %current_i, 10
  br i1 %i_le_10, label %loop_body, label %loop_exit

loop_body:
  %current_sum = load i64, ptr %sum
  %current_i_body = load i64, ptr %i
  %new_sum = add i64 %current_sum, %current_i_body
  store i64 %new_sum, ptr %sum
  %new_i = add i64 %current_i_body, 1
  store i64 %new_i, ptr %i
  br label %loop_header

loop_exit:
  %final_sum = load i64, ptr %sum
  ret i64 %final_sum
}

Error Handling in Control Flow

Safe Condition Evaluation

#![allow(unused)]
fn main() {
fn safe_conditional_branch<'ctx>(
    builder: &Builder<'ctx>,
    condition: IntValue<'ctx>,
    then_block: BasicBlock<'ctx>,
    else_block: BasicBlock<'ctx>
) -> Result<(), String> {
    if condition.get_type().get_bit_width() != 1 {
        return Err(format!(
            "Condition must be i1, got i{}",
            condition.get_type().get_bit_width()
        ));
    }

    builder.build_conditional_branch(condition, then_block, else_block)
        .map_err(|e| format!("Failed to build conditional branch: {}", e))?;

    Ok(())
}
}

PHI Node Validation

#![allow(unused)]
fn main() {
fn validate_phi_node<'ctx>(
    phi: PhiValue<'ctx>,
    expected_type: BasicTypeEnum<'ctx>
) -> Result<(), String> {
    if phi.as_basic_value().get_type() != expected_type {
        return Err(format!(
            "PHI type mismatch: expected {:?}, got {:?}",
            expected_type,
            phi.as_basic_value().get_type()
        ));
    }

    if phi.count_incoming() == 0 {
        return Err("PHI node has no incoming values".to_string());
    }

    Ok(())
}
}
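
Beyond type and emptiness checks, LLVM expects a PHI node to carry an incoming value for every predecessor of its block. The following is a minimal std-only sketch of that rule, not Inkwell code: the function name and the string-based block names are hypothetical, and it assumes each predecessor branches into the block exactly once.

```rust
use std::collections::HashSet;

/// Checks that a PHI node lists exactly one incoming value per predecessor block.
/// `predecessors` names the blocks that branch into the PHI's block;
/// `incoming` holds the (value, source block) pairs attached to the PHI.
fn check_phi_incoming(predecessors: &[&str], incoming: &[(i64, &str)]) -> Result<(), String> {
    let preds: HashSet<&str> = predecessors.iter().copied().collect();
    let mut seen: HashSet<&str> = HashSet::new();

    for &(_, block) in incoming {
        if !preds.contains(block) {
            return Err(format!("incoming block '{}' is not a predecessor", block));
        }
        if !seen.insert(block) {
            return Err(format!("duplicate incoming entry for block '{}'", block));
        }
    }

    for pred in predecessors {
        if !seen.contains(pred) {
            return Err(format!("missing incoming value for predecessor '{}'", pred));
        }
    }
    Ok(())
}
```

For an if/else merge, `check_phi_incoming(&["then", "else"], &[(1, "then"), (2, "else")])` succeeds, while dropping either pair reports the missing predecessor.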

Optimization Considerations

Minimizing Basic Blocks

#![allow(unused)]
fn main() {
// Prefer this: direct value computation when possible
let condition = bool_type.const_int(1, false);
let result = builder.build_select(
    condition,
    i64_type.const_int(42, false),
    i64_type.const_int(0, false),
    "conditional_result"
).unwrap();

// Over this: creating basic blocks for simple conditionals
// (Only use basic blocks when necessary for complex control flow)
}

Loop Optimization Hints

#![allow(unused)]
fn main() {
// Give the loop a dedicated header block
let loop_header = context.append_basic_block(function, "loop_header");
// LLVM detects natural loops on its own through loop analysis. Explicit hints
// (such as unroll requests) are expressed as `llvm.loop` metadata on the
// back-edge branch, not as attributes on the header block.
}

Dead Code Elimination

#![allow(unused)]
fn main() {
// Ensure every basic block ends in a terminator
// (a well-formedness check; it does not prove that the block is reachable)
fn validate_cfg<'ctx>(function: FunctionValue<'ctx>) -> Result<(), String> {
    for block in function.get_basic_blocks() {
        if block.get_terminator().is_none() {
            return Err(format!(
                "Basic block '{}' has no terminator",
                block.get_name().to_string_lossy()
            ));
        }
    }
    Ok(())
}
}
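
Checking terminators does not tell you whether a block can actually be reached. A reachability pass over the control-flow graph could look roughly like the sketch below. It is a std-only model, not Inkwell code: blocks are plain indices, `successors` is a hypothetical adjacency list, and block 0 is assumed to be the entry block.

```rust
use std::collections::VecDeque;

/// Returns the indices of blocks that cannot be reached from block 0 (the entry).
/// `successors[b]` lists the blocks that block `b` may branch to.
fn unreachable_blocks(successors: &[Vec<usize>]) -> Vec<usize> {
    let mut visited = vec![false; successors.len()];
    let mut queue: VecDeque<usize> = VecDeque::new();

    if !successors.is_empty() {
        visited[0] = true;
        queue.push_back(0);
    }

    // Breadth-first walk over branch targets
    while let Some(block) = queue.pop_front() {
        for &next in &successors[block] {
            if !visited[next] {
                visited[next] = true;
                queue.push_back(next);
            }
        }
    }

    (0..successors.len()).filter(|&b| !visited[b]).collect()
}
```

Blocks reported by such a pass are candidates for removal before the function is handed to LLVM, or a sign that a branch was wired to the wrong target.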

This section covered the building blocks for implementing Y Lang's conditional expressions, loops, and blocks in LLVM: basic blocks, PHI nodes, and keeping the IR in SSA form.

Data Structures

This section covers implementing Y Lang's composite data types using Inkwell, including arrays, structs, tuples, and their memory layout and access patterns.

Arrays

Why arrays need careful layout: LLVM arrays are contiguous memory blocks whose size is known at compile time, which makes indexing a simple address computation. LLVM inserts no bounds checks of its own, so any checking Y Lang wants must be emitted explicitly by the code generator.

Array Declaration and Initialization

Y Lang arrays map to LLVM array types with explicit element types and sizes:

#![allow(unused)]
fn main() {
use inkwell::context::Context;

let context = Context::create();
let module = context.create_module("arrays");
let builder = context.create_builder();

let i64_type = context.i64_type();
let array_type = i64_type.array_type(5); // [i64; 5]

// Function context for arrays
let fn_type = context.void_type().fn_type(&[], false);
let function = module.add_function("test_arrays", fn_type, None);
let entry_block = context.append_basic_block(function, "entry");
builder.position_at_end(entry_block);

// Declare array: let arr: [i64; 5];
let array_alloca = builder.build_alloca(array_type, "arr").unwrap();
}

Generated LLVM IR:

define void @test_arrays() {
entry:
  %arr = alloca [5 x i64]
  ret void
}

Implementation steps:

  1. Determine element type and array size at compile time
  2. Create LLVM array type with array_type(size)
  3. Allocate stack space with build_alloca
  4. Handle initialization through element-by-element stores or constant arrays

Array Initialization with Constants

#![allow(unused)]
fn main() {
// Initialize with constant values: let arr = [1, 2, 3, 4, 5];
let values = [1, 2, 3, 4, 5].map(|v| i64_type.const_int(v, false));
let array_constant = i64_type.const_array(&values);

let array_alloca = builder.build_alloca(array_type, "arr").unwrap();
builder.build_store(array_alloca, array_constant).unwrap();
}

Generated LLVM IR:

%arr = alloca [5 x i64]
store [5 x i64] [i64 1, i64 2, i64 3, i64 4, i64 5], ptr %arr

Array Element Access

Array indexing requires calculating element addresses with GEP (GetElementPtr):

#![allow(unused)]
fn main() {
// Access element: arr[2]
let zero = i64_type.const_int(0, false);      // Array base offset
let index = i64_type.const_int(2, false);     // Element index

let element_ptr = unsafe {
    builder.build_gep(array_type, array_alloca, &[zero, index], "elem_ptr").unwrap()
};

// Load element value
let element_value = builder.build_load(i64_type, element_ptr, "element").unwrap();
}

Generated LLVM IR:

%elem_ptr = getelementptr [5 x i64], ptr %arr, i64 0, i64 2
%element = load i64, ptr %elem_ptr

GEP indexing explanation:

  • First index (0): Steps through the pointer %arr itself, selecting the whole array it points to
  • Second index (2): Select the 3rd element (0-based indexing)
  • Result: Pointer to the specific array element
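
The address arithmetic behind the two indices can be written out by hand. The helper below is a hypothetical illustration, not part of Inkwell; it assumes elements are laid out back to back and uses unsigned arithmetic for simplicity, whereas real GEP indices are signed.

```rust
/// Byte offset produced by `getelementptr [len x T], ptr %p, i64 first, i64 second`,
/// assuming elements are stored back to back with no padding between them.
fn array_gep_offset(elem_size: u64, len: u64, first: u64, second: u64) -> u64 {
    let array_size = elem_size * len; // size of one whole [len x T]
    // The first index strides over whole arrays, the second over elements.
    first * array_size + second * elem_size
}
```

For the `[5 x i64]` example, `array_gep_offset(8, 5, 0, 2)` gives 16: the first index contributes nothing and the second skips two 8-byte elements.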

Array Element Assignment

#![allow(unused)]
fn main() {
// Assignment: arr[1] = 42;
let one_index = i64_type.const_int(1, false);
let new_value = i64_type.const_int(42, false);

let target_ptr = unsafe {
    builder.build_gep(array_type, array_alloca, &[zero, one_index], "target_ptr").unwrap()
};

builder.build_store(target_ptr, new_value).unwrap();
}

Generated LLVM IR:

%target_ptr = getelementptr [5 x i64], ptr %arr, i64 0, i64 1
store i64 42, ptr %target_ptr

Dynamic Array Indexing

For runtime-computed indices, bounds checking becomes important:

#![allow(unused)]
fn main() {
use inkwell::IntPredicate;

// arr[runtime_index]; `index_var` is assumed to be an existing i64 alloca holding the index
let runtime_index = builder.build_load(i64_type, index_var, "runtime_idx").unwrap().into_int_value();

// Bounds check (optional but recommended)
let array_len = i64_type.const_int(5, false);
let in_bounds = builder.build_int_compare(
    IntPredicate::ULT,
    runtime_index,
    array_len,
    "in_bounds"
).unwrap();

// For safety, use conditional access or trap on out-of-bounds
let safe_index = builder.build_select(
    in_bounds,
    runtime_index,
    zero, // Default to index 0 if out of bounds
    "safe_index"
).unwrap();

let elem_ptr = unsafe {
    builder.build_gep(array_type, array_alloca, &[zero, safe_index.into_int_value()], "dyn_ptr").unwrap()
};
}

Generated LLVM IR:

%runtime_idx = load i64, ptr %index_var
%in_bounds = icmp ult i64 %runtime_idx, 5
%safe_index = select i1 %in_bounds, i64 %runtime_idx, i64 0
%dyn_ptr = getelementptr [5 x i64], ptr %arr, i64 0, i64 %safe_index
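
One detail worth spelling out is why the check uses an unsigned comparison: a negative signed index, reinterpreted as unsigned, becomes a very large number and therefore fails the same test as an index that is too big. The function below is a plain-Rust model of the `icmp ult` plus `select` pair (the name is hypothetical).

```rust
/// Models `icmp ult` followed by `select`: the signed index is compared as an
/// unsigned value, so negative indices fail the check just like too-large ones.
fn select_safe_index(index: i64, len: u64) -> u64 {
    let as_unsigned = index as u64;         // same bits, unsigned interpretation
    let in_bounds = as_unsigned < len;      // icmp ult
    if in_bounds { as_unsigned } else { 0 } // select i1 %in_bounds, %idx, 0
}
```

Note that falling back to index 0 hides the error from the program. Branching to a trap or an error path is usually the safer design when out-of-bounds access should be observable.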

Structs

Why structs need structured layout: LLVM struct types enable efficient field access, proper alignment, and type safety for composite data, supporting both performance and correctness.

Struct Type Definition

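Y Lang structs map to LLVM struct types. LLVM struct fields are always positional and carry no names; the type itself is either literal (anonymous, as created by context.struct_type) or identified (named):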

#![allow(unused)]
fn main() {
// Y Lang: struct Point { x: i64, y: i64 }
let field_types = vec![i64_type.into(), i64_type.into()];
let point_type = context.struct_type(&field_types, false); // false = not packed
}

Generated LLVM IR:

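{ i64, i64 }  ; literal struct type, printed inline wherever it is used
; A named definition (%Point = type { i64, i64 }) requires an identified type,
; created with context.opaque_struct_type("Point") followed by set_body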

Struct Variable Declaration and Initialization

#![allow(unused)]
fn main() {
// let point = Point { x: 10, y: 20 };
let point_alloca = builder.build_alloca(point_type, "point").unwrap();

// Initialize fields individually
let x_ptr = builder.build_struct_gep(point_type, point_alloca, 0, "x_ptr").unwrap();
let y_ptr = builder.build_struct_gep(point_type, point_alloca, 1, "y_ptr").unwrap();

let x_value = i64_type.const_int(10, false);
let y_value = i64_type.const_int(20, false);

builder.build_store(x_ptr, x_value).unwrap();
builder.build_store(y_ptr, y_value).unwrap();
}

Generated LLVM IR:

%point = alloca { i64, i64 }
%x_ptr = getelementptr { i64, i64 }, ptr %point, i32 0, i32 0
%y_ptr = getelementptr { i64, i64 }, ptr %point, i32 0, i32 1
store i64 10, ptr %x_ptr
store i64 20, ptr %y_ptr

Struct Constant Initialization

For compile-time known values, use struct constants:

#![allow(unused)]
fn main() {
// Efficient constant initialization
let field_values = vec![x_value.into(), y_value.into()];
let struct_constant = point_type.const_named_struct(&field_values);

let point_alloca = builder.build_alloca(point_type, "point").unwrap();
builder.build_store(point_alloca, struct_constant).unwrap();
}

Generated LLVM IR:

%point = alloca { i64, i64 }
store { i64, i64 } { i64 10, i64 20 }, ptr %point

Struct Field Access

#![allow(unused)]
fn main() {
// Access field: point.x
let x_ptr = builder.build_struct_gep(point_type, point_alloca, 0, "x_field").unwrap();
let x_value = builder.build_load(i64_type, x_ptr, "x_val").unwrap();

// Access field: point.y
let y_ptr = builder.build_struct_gep(point_type, point_alloca, 1, "y_field").unwrap();
let y_value = builder.build_load(i64_type, y_ptr, "y_val").unwrap();
}

Generated LLVM IR:

%x_field = getelementptr { i64, i64 }, ptr %point, i32 0, i32 0
%x_val = load i64, ptr %x_field
%y_field = getelementptr { i64, i64 }, ptr %point, i32 0, i32 1
%y_val = load i64, ptr %y_field

Struct Field Assignment

#![allow(unused)]
fn main() {
// Modify field: point.x = 42;
let x_ptr = builder.build_struct_gep(point_type, point_alloca, 0, "x_field").unwrap();
let new_x = i64_type.const_int(42, false);
builder.build_store(x_ptr, new_x).unwrap();
}

Generated LLVM IR:

%x_field = getelementptr { i64, i64 }, ptr %point, i32 0, i32 0
store i64 42, ptr %x_field

Nested Structs

Structs containing other structs require careful GEP indexing:

#![allow(unused)]
fn main() {
// Y Lang: struct Rectangle { top_left: Point, bottom_right: Point }
let rectangle_type = context.struct_type(&[
    point_type.into(),  // top_left field
    point_type.into(),  // bottom_right field
], false);

let rect_alloca = builder.build_alloca(rectangle_type, "rect").unwrap();

// Access nested field: rect.top_left.x
let top_left_ptr = builder.build_struct_gep(rectangle_type, rect_alloca, 0, "top_left").unwrap();
let x_ptr = builder.build_struct_gep(point_type, top_left_ptr, 0, "x_ptr").unwrap();
let x_value = builder.build_load(i64_type, x_ptr, "x_val").unwrap();
}

Generated LLVM IR:

%Rectangle = type { { i64, i64 }, { i64, i64 } }
%rect = alloca { { i64, i64 }, { i64, i64 } }
%top_left = getelementptr { { i64, i64 }, { i64, i64 } }, ptr %rect, i32 0, i32 0
%x_ptr = getelementptr { i64, i64 }, ptr %top_left, i32 0, i32 0
%x_val = load i64, ptr %x_ptr
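
Chained struct GEPs simply add up field offsets. The sketch below measures this in plain Rust for a different field, `rect.bottom_right.y`, using `#[repr(C)]` so that field order matches the declaration. The struct definitions mirror the Y Lang types and are assumptions for illustration only.

```rust
#[repr(C)]
struct Point {
    x: i64,
    y: i64,
}

#[repr(C)]
struct Rectangle {
    top_left: Point,
    bottom_right: Point,
}

/// Byte offset of `rect.bottom_right.y`, measured the way two chained GEPs
/// would compute it: offset of the inner struct plus offset of the field in it.
fn bottom_right_y_offset() -> usize {
    let rect = Rectangle {
        top_left: Point { x: 0, y: 0 },
        bottom_right: Point { x: 0, y: 0 },
    };
    let base = &rect as *const Rectangle as usize;
    let field = &rect.bottom_right.y as *const i64 as usize;
    field - base
}
```

The result is 24: `bottom_right` starts 16 bytes into the rectangle, and `y` sits 8 bytes into the point.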

Tuples

Why tuples are like anonymous structs: LLVM has no tuple type of its own, so Y Lang tuples are lowered to literal struct types. This gives compact storage of heterogeneous data with positional access.

Tuple Type Definition and Creation

#![allow(unused)]
fn main() {
// Y Lang: (i64, f64, bool)
let f64_type = context.f64_type();
let bool_type = context.bool_type();

let tuple_type = context.struct_type(&[
    i64_type.into(),
    f64_type.into(),
    bool_type.into(),
], false);

// Create tuple: (42, 3.14, true)
let tuple_alloca = builder.build_alloca(tuple_type, "tuple").unwrap();

// Initialize elements
let elem0_ptr = builder.build_struct_gep(tuple_type, tuple_alloca, 0, "elem0").unwrap();
let elem1_ptr = builder.build_struct_gep(tuple_type, tuple_alloca, 1, "elem1").unwrap();
let elem2_ptr = builder.build_struct_gep(tuple_type, tuple_alloca, 2, "elem2").unwrap();

builder.build_store(elem0_ptr, i64_type.const_int(42, false)).unwrap();
builder.build_store(elem1_ptr, f64_type.const_float(3.14)).unwrap();
builder.build_store(elem2_ptr, bool_type.const_int(1, false)).unwrap();
}

Generated LLVM IR:

%tuple = alloca { i64, double, i1 }
%elem0 = getelementptr { i64, double, i1 }, ptr %tuple, i32 0, i32 0
%elem1 = getelementptr { i64, double, i1 }, ptr %tuple, i32 0, i32 1
%elem2 = getelementptr { i64, double, i1 }, ptr %tuple, i32 0, i32 2
store i64 42, ptr %elem0
store double 3.140000e+00, ptr %elem1
store i1 true, ptr %elem2

Tuple Element Access

#![allow(unused)]
fn main() {
// Access tuple elements: tuple.0, tuple.1, tuple.2
let elem0_ptr = builder.build_struct_gep(tuple_type, tuple_alloca, 0, "get_0").unwrap();
let elem0_val = builder.build_load(i64_type, elem0_ptr, "val_0").unwrap();

let elem1_ptr = builder.build_struct_gep(tuple_type, tuple_alloca, 1, "get_1").unwrap();
let elem1_val = builder.build_load(f64_type, elem1_ptr, "val_1").unwrap();

let elem2_ptr = builder.build_struct_gep(tuple_type, tuple_alloca, 2, "get_2").unwrap();
let elem2_val = builder.build_load(bool_type, elem2_ptr, "val_2").unwrap();
}

Generated LLVM IR:

%get_0 = getelementptr { i64, double, i1 }, ptr %tuple, i32 0, i32 0
%val_0 = load i64, ptr %get_0
%get_1 = getelementptr { i64, double, i1 }, ptr %tuple, i32 0, i32 1
%val_1 = load double, ptr %get_1
%get_2 = getelementptr { i64, double, i1 }, ptr %tuple, i32 0, i32 2
%val_2 = load i1, ptr %get_2

Tuple Destructuring

Y Lang tuple destructuring can be implemented through multiple GEP operations:

#![allow(unused)]
fn main() {
// Y Lang: let (x, y, flag) = tuple;
let x_alloca = builder.build_alloca(i64_type, "x").unwrap();
let y_alloca = builder.build_alloca(f64_type, "y").unwrap();
let flag_alloca = builder.build_alloca(bool_type, "flag").unwrap();

// Extract and store each element
let elem0_ptr = builder.build_struct_gep(tuple_type, tuple_alloca, 0, "extract_0").unwrap();
let elem0_val = builder.build_load(i64_type, elem0_ptr, "x_val").unwrap();
builder.build_store(x_alloca, elem0_val).unwrap();

let elem1_ptr = builder.build_struct_gep(tuple_type, tuple_alloca, 1, "extract_1").unwrap();
let elem1_val = builder.build_load(f64_type, elem1_ptr, "y_val").unwrap();
builder.build_store(y_alloca, elem1_val).unwrap();

let elem2_ptr = builder.build_struct_gep(tuple_type, tuple_alloca, 2, "extract_2").unwrap();
let elem2_val = builder.build_load(bool_type, elem2_ptr, "flag_val").unwrap();
builder.build_store(flag_alloca, elem2_val).unwrap();
}

Generated LLVM IR:

%x = alloca i64
%y = alloca double
%flag = alloca i1
%extract_0 = getelementptr { i64, double, i1 }, ptr %tuple, i32 0, i32 0
%x_val = load i64, ptr %extract_0
store i64 %x_val, ptr %x
%extract_1 = getelementptr { i64, double, i1 }, ptr %tuple, i32 0, i32 1
%y_val = load double, ptr %extract_1
store double %y_val, ptr %y
%extract_2 = getelementptr { i64, double, i1 }, ptr %tuple, i32 0, i32 2
%flag_val = load i1, ptr %extract_2
store i1 %flag_val, ptr %flag

Memory Layout and Alignment

Why layout matters: Understanding memory layout enables performance optimization and proper alignment for different architectures.

Struct Padding and Alignment

#![allow(unused)]
fn main() {
// Struct with different-sized fields
let mixed_struct_type = context.struct_type(&[
    context.i8_type().into(),   // 1 byte
    i64_type.into(),            // 8 bytes
    context.i16_type().into(),  // 2 bytes
], false); // Natural alignment

// Packed struct (no padding)
let packed_struct_type = context.struct_type(&[
    context.i8_type().into(),
    i64_type.into(),
    context.i16_type().into(),
], true); // Packed alignment
}

Generated LLVM IR:

; Natural alignment (with padding)
%MixedStruct = type { i8, i64, i16 }  ; 24 bytes on typical 64-bit targets: 7 bytes of padding after the i8, 6 after the i16

; Packed alignment (no padding)
%PackedStruct = type <{ i8, i64, i16 }>  ; Exactly 11 bytes
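
The sizes quoted above follow from a simple rule: each field is placed at the next offset that satisfies its alignment, and the total is rounded up to the largest alignment. The helper below is a hypothetical std-only sketch of that rule. It takes (size, alignment) pairs in bytes and assumes an i64 alignment of 8, as on typical 64-bit targets.

```rust
/// Computes field offsets and total size for a struct whose fields are given as
/// (size, alignment) pairs in bytes. With `packed`, every alignment is treated as 1.
fn struct_layout(fields: &[(usize, usize)], packed: bool) -> (Vec<usize>, usize) {
    let mut offsets = Vec::with_capacity(fields.len());
    let mut offset = 0usize;
    let mut max_align = 1usize;

    for &(size, align) in fields {
        let align = if packed { 1 } else { align };
        // Round the running offset up to the field's alignment
        offset = (offset + align - 1) / align * align;
        offsets.push(offset);
        offset += size;
        max_align = max_align.max(align);
    }

    // Tail padding so that arrays of the struct keep every element aligned
    let total = (offset + max_align - 1) / max_align * max_align;
    (offsets, total)
}
```

With the fields `[(1, 1), (8, 8), (2, 2)]` this yields offsets 0, 8, 16 and a size of 24 for the naturally aligned struct, and offsets 0, 1, 9 with a size of 11 when packed.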

Array of Structs

Combining arrays and structs for complex data layouts:

#![allow(unused)]
fn main() {
// Array of points: [Point; 3]
let point_array_type = point_type.array_type(3);
let points_alloca = builder.build_alloca(point_array_type, "points").unwrap();

// Access specific point: points[1].x
let zero = i64_type.const_int(0, false);
let index_1 = i64_type.const_int(1, false);

let point_ptr = unsafe {
    builder.build_gep(point_array_type, points_alloca, &[zero, index_1], "point_1").unwrap()
};

let x_ptr = builder.build_struct_gep(point_type, point_ptr, 0, "point_1_x").unwrap();
let x_value = builder.build_load(i64_type, x_ptr, "x_val").unwrap();
}

Generated LLVM IR:

%points = alloca [3 x { i64, i64 }]
%point_1 = getelementptr [3 x { i64, i64 }], ptr %points, i64 0, i64 1
%point_1_x = getelementptr { i64, i64 }, ptr %point_1, i32 0, i32 0
%x_val = load i64, ptr %point_1_x

Advanced Data Structure Patterns

Generic-Like Structs

Using LLVM's type system to simulate generics:

#![allow(unused)]
fn main() {
// Different instantiations of "generic" container
fn create_container_type<'ctx>(context: &'ctx Context, element_type: BasicTypeEnum<'ctx>) -> StructType<'ctx> {
    context.struct_type(&[
        element_type,                           // data
        context.i64_type().into(),             // size
        context.i64_type().into(),             // capacity
    ], false)
}

let int_container = create_container_type(&context, i64_type.into());
let float_container = create_container_type(&context, f64_type.into());
}

Optional Types (Sum Types)

Implementing Option using tagged unions:

#![allow(unused)]
fn main() {
// Option<i64> as tagged union
let option_type = context.struct_type(&[
    context.i8_type().into(),   // Tag: 0 = None, 1 = Some
    i64_type.into(),            // Value (only valid if tag == 1)
], false);

// Create Some(42)
let some_42 = builder.build_alloca(option_type, "some_42").unwrap();

let tag_ptr = builder.build_struct_gep(option_type, some_42, 0, "tag_ptr").unwrap();
let val_ptr = builder.build_struct_gep(option_type, some_42, 1, "val_ptr").unwrap();

builder.build_store(tag_ptr, context.i8_type().const_int(1, false)).unwrap(); // Some
builder.build_store(val_ptr, i64_type.const_int(42, false)).unwrap();
}

Generated LLVM IR:

%Option_i64 = type { i8, i64 }
%some_42 = alloca { i8, i64 }
%tag_ptr = getelementptr { i8, i64 }, ptr %some_42, i32 0, i32 0
%val_ptr = getelementptr { i8, i64 }, ptr %some_42, i32 0, i32 1
store i8 1, ptr %tag_ptr
store i64 42, ptr %val_ptr
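
Reading such a value back always starts with the tag. The plain-Rust model below mirrors the `{ i8, i64 }` layout; the type and function names are hypothetical, and it only illustrates the access discipline that generated code would follow.

```rust
/// Plain-Rust mirror of the `{ i8, i64 }` layout: tag 0 = None, tag 1 = Some.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct OptionI64 {
    tag: u8,
    value: i64, // only meaningful when tag == 1
}

fn make_some(value: i64) -> OptionI64 {
    OptionI64 { tag: 1, value }
}

fn make_none() -> OptionI64 {
    OptionI64 { tag: 0, value: 0 }
}

/// Reads the tag first and touches the payload only on the `Some` path.
fn unwrap_or(opt: OptionI64, default: i64) -> i64 {
    if opt.tag == 1 { opt.value } else { default }
}
```

In IR terms, `unwrap_or` corresponds to loading the tag, comparing it against 1, and either loading the payload or using the default, via a branch or a select.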

Dynamic Arrays (Vectors)

Implementing growable arrays with heap allocation:

#![allow(unused)]
fn main() {
// Vector representation: { ptr, length, capacity }
let ptr_type = context.ptr_type(Default::default());
let vector_type = context.struct_type(&[
    ptr_type.into(),        // data pointer
    i64_type.into(),        // length
    i64_type.into(),        // capacity
], false);

let vec_alloca = builder.build_alloca(vector_type, "vector").unwrap();

// Initialize empty vector
let null_ptr = ptr_type.const_null();
let zero_len = i64_type.const_zero();
let zero_cap = i64_type.const_zero();

let data_ptr = builder.build_struct_gep(vector_type, vec_alloca, 0, "data_field").unwrap();
let len_ptr = builder.build_struct_gep(vector_type, vec_alloca, 1, "len_field").unwrap();
let cap_ptr = builder.build_struct_gep(vector_type, vec_alloca, 2, "cap_field").unwrap();

builder.build_store(data_ptr, null_ptr).unwrap();
builder.build_store(len_ptr, zero_len).unwrap();
builder.build_store(cap_ptr, zero_cap).unwrap();
}

Generated LLVM IR:

%Vector = type { ptr, i64, i64 }
%vector = alloca { ptr, i64, i64 }
%data_field = getelementptr { ptr, i64, i64 }, ptr %vector, i32 0, i32 0
%len_field = getelementptr { ptr, i64, i64 }, ptr %vector, i32 0, i32 1
%cap_field = getelementptr { ptr, i64, i64 }, ptr %vector, i32 0, i32 2
store ptr null, ptr %data_field
store i64 0, ptr %len_field
store i64 0, ptr %cap_field
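
The listing above only creates an empty vector. How the three fields evolve on `push` can be modeled in plain Rust as below. This is a sketch under stated assumptions: `RawVector` and its functions are hypothetical, a `Vec<i64>` stands in for the heap buffer behind the data pointer, and the growth policy (start at 4, then double) is chosen for illustration rather than taken from Y Lang.

```rust
/// Model of the `{ ptr, length, capacity }` triple; `data` stands in for the heap buffer.
struct RawVector {
    data: Vec<i64>, // plays the role of the allocation behind the data pointer
    len: usize,
    cap: usize,
}

/// Empty vector: no buffer, zero length, zero capacity.
fn new_vector() -> RawVector {
    RawVector { data: Vec::new(), len: 0, cap: 0 }
}

/// Appends a value, growing the buffer first when it is full.
/// The growth policy (start at 4, then double) is an assumption for illustration.
fn push(v: &mut RawVector, value: i64) {
    if v.len == v.cap {
        let new_cap = if v.cap == 0 { 4 } else { v.cap * 2 };
        v.data.resize(new_cap, 0); // stands in for a realloc call
        v.cap = new_cap;
    }
    v.data[v.len] = value;
    v.len += 1;
}
```

Pushing five values into a fresh vector leaves `len` at 5 and `cap` at 8, since the buffer grows once from 0 to 4 and once from 4 to 8.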

Error Handling and Validation

Bounds Checking for Data Structures

#![allow(unused)]
fn main() {
fn safe_array_access<'ctx>(
    builder: &Builder<'ctx>,
    array_ptr: PointerValue<'ctx>,
    array_type: ArrayType<'ctx>,
    index: IntValue<'ctx>,
    element_type: BasicTypeEnum<'ctx>
) -> Result<BasicValueEnum<'ctx>, String> {
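    // Build the length and zero constants in the index's integer type; the
    // element type may be a float, a struct, or a narrower integer.
    let index_type = index.get_type();
    let array_len = array_type.len() as u64;
    let len_const = index_type.const_int(array_len, false);

    // Runtime bounds check
    let in_bounds = builder.build_int_compare(
        IntPredicate::ULT,
        index,
        len_const,
        "bounds_check"
    ).map_err(|e| format!("Failed bounds check: {}", e))?;

    // `in_bounds` is not acted on yet: branch on it to a trap or error block
    // before the GEP to make the check effective.
    let zero = index_type.const_zero();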
    let elem_ptr = unsafe {
        builder.build_gep(array_type, array_ptr, &[zero, index], "safe_elem")
            .map_err(|e| format!("GEP failed: {}", e))?
    };

    builder.build_load(element_type, elem_ptr, "safe_load")
        .map_err(|e| format!("Load failed: {}", e))
}
}

Type Safety for Struct Fields

#![allow(unused)]
fn main() {
fn safe_struct_field_access<'ctx>(
    builder: &Builder<'ctx>,
    struct_ptr: PointerValue<'ctx>,
    struct_type: StructType<'ctx>,
    field_index: u32,
    expected_type: BasicTypeEnum<'ctx>
) -> Result<BasicValueEnum<'ctx>, String> {
    let field_types = struct_type.get_field_types();

    if field_index as usize >= field_types.len() {
        return Err(format!("Field index {} out of bounds", field_index));
    }

    let actual_type = field_types[field_index as usize];
    if actual_type != expected_type {
        return Err(format!("Type mismatch: expected {:?}, got {:?}", expected_type, actual_type));
    }

    let field_ptr = builder.build_struct_gep(struct_type, struct_ptr, field_index, "field")
        .map_err(|e| format!("Struct GEP failed: {}", e))?;

    builder.build_load(expected_type, field_ptr, "field_val")
        .map_err(|e| format!("Field load failed: {}", e))
}
}

Performance Optimization Strategies

Minimizing Memory Operations

#![allow(unused)]
fn main() {
// Efficient: Load struct once, extract fields as needed
let struct_val = builder.build_load(point_type, point_alloca, "point_val").unwrap();
let x_val = builder.build_extract_value(struct_val.into_struct_value(), 0, "x").unwrap();
let y_val = builder.build_extract_value(struct_val.into_struct_value(), 1, "y").unwrap();

// Less efficient: Multiple GEP + load operations
let x_ptr = builder.build_struct_gep(point_type, point_alloca, 0, "x_ptr").unwrap();
let x_val_slow = builder.build_load(i64_type, x_ptr, "x_slow").unwrap();
}

Prefer Stack Allocation When Possible

#![allow(unused)]
fn main() {
// Good: Stack allocation for known-size data
let local_array = builder.build_alloca(array_type, "local").unwrap();

// Only use heap allocation when necessary (dynamic size, large data, etc.)
}

Leverage LLVM's Optimization Passes

#![allow(unused)]
fn main() {
// LLVM can optimize away unnecessary loads/stores and GEP chains.
// Structure the generated IR to enable these optimizations:
// 1. Emit allocas in the function's entry block so mem2reg can promote them
// 2. Avoid redundant memory operations
// 3. Choose field order in the frontend; LLVM does not reorder struct fields
}

This section covered how to implement Y Lang's composite types in LLVM, with attention to memory layout, type safety, and performance for arrays, structs, tuples, and the more advanced patterns built on them.

Advanced Constructs

This section covers implementing Y Lang's advanced language constructs using Inkwell, including lambda expressions, closures, method calls, and pattern matching, all of which require more involved LLVM IR generation.

Lambda Expressions and Function Values

Why lambdas need special handling: Lambda expressions create anonymous functions that can capture variables from their environment, requiring careful management of function pointers, closure environments, and calling conventions.

Basic Lambda Expression

Y Lang lambdas are first-class values that can be passed around and called:

#![allow(unused)]
fn main() {
use inkwell::context::Context;
use inkwell::types::BasicMetadataTypeEnum;

let context = Context::create();
let module = context.create_module("lambdas");
let builder = context.create_builder();

let i64_type = context.i64_type();
let ptr_type = context.ptr_type(Default::default());

// Y Lang lambda: \(x) => x + 1, with type (i64) -> i64
// Step 1: Create the lambda function
let lambda_param_types = vec![BasicMetadataTypeEnum::IntType(i64_type)];
let lambda_fn_type = i64_type.fn_type(&lambda_param_types, false);
let lambda_function = module.add_function("lambda_0", lambda_fn_type, None);

// Step 2: Implement lambda body
let lambda_entry = context.append_basic_block(lambda_function, "entry");
builder.position_at_end(lambda_entry);

let x_param = lambda_function.get_nth_param(0).unwrap().into_int_value();
let one = i64_type.const_int(1, false);
let result = builder.build_int_add(x_param, one, "add_one").unwrap();
builder.build_return(Some(&result)).unwrap();

// Step 3: Create function pointer value
let lambda_ptr = lambda_function.as_global_value().as_pointer_value();
}

Generated LLVM IR:

define i64 @lambda_0(i64 %0) {
entry:
  %add_one = add i64 %0, 1
  ret i64 %add_one
}

; Usage would involve function pointer: @lambda_0

Lambda with Variable Capture (Closures)

Closures capture variables from their enclosing scope, requiring environment structures:

#![allow(unused)]
fn main() {
// Y Lang: let y = 10; let closure = \(x) => x + y;
// This requires creating a closure environment

// Step 1: Define closure environment structure
let closure_env_type = context.struct_type(&[
    i64_type.into(), // captured variable 'y'
], false);

// Step 2: Create closure function that takes environment + parameters
let closure_param_types = vec![
    BasicMetadataTypeEnum::PointerType(closure_env_type.ptr_type(Default::default())), // env
    BasicMetadataTypeEnum::IntType(i64_type), // x parameter
];
let closure_fn_type = i64_type.fn_type(&closure_param_types, false);
let closure_function = module.add_function("closure_0", closure_fn_type, None);

// Step 3: Implement closure body
let closure_entry = context.append_basic_block(closure_function, "entry");
builder.position_at_end(closure_entry);

let env_param = closure_function.get_nth_param(0).unwrap().into_pointer_value();
let x_param = closure_function.get_nth_param(1).unwrap().into_int_value();

// Extract captured variable from environment
let y_ptr = builder.build_struct_gep(closure_env_type, env_param, 0, "y_ptr").unwrap();
let y_val = builder.build_load(i64_type, y_ptr, "y_val").unwrap().into_int_value();

// Compute x + y
let closure_result = builder.build_int_add(x_param, y_val, "x_plus_y").unwrap();
builder.build_return(Some(&closure_result)).unwrap();

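// Step 4: Create closure environment at runtime.
// These instructions belong to the enclosing function: reposition the builder
// into one of its blocks first, since closure_0's entry block is already terminated.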
let y_value = i64_type.const_int(10, false);
let env_alloca = builder.build_alloca(closure_env_type, "closure_env").unwrap();
let y_field_ptr = builder.build_struct_gep(closure_env_type, env_alloca, 0, "y_field").unwrap();
builder.build_store(y_field_ptr, y_value).unwrap();

// Step 5: Create closure representation (function pointer + environment)
let closure_type = context.struct_type(&[
    closure_fn_type.ptr_type(Default::default()).into(), // function pointer
    closure_env_type.ptr_type(Default::default()).into(), // environment pointer
], false);

let closure_alloca = builder.build_alloca(closure_type, "closure").unwrap();
let fn_ptr_field = builder.build_struct_gep(closure_type, closure_alloca, 0, "fn_ptr_field").unwrap();
let env_ptr_field = builder.build_struct_gep(closure_type, closure_alloca, 1, "env_ptr_field").unwrap();

let closure_fn_ptr = closure_function.as_global_value().as_pointer_value();
builder.build_store(fn_ptr_field, closure_fn_ptr).unwrap();
builder.build_store(env_ptr_field, env_alloca).unwrap();
}

Generated LLVM IR:

%ClosureEnv = type { i64 }
%Closure = type { ptr, ptr }

define i64 @closure_0(ptr %0, i64 %1) {
entry:
  %y_ptr = getelementptr %ClosureEnv, ptr %0, i32 0, i32 0
  %y_val = load i64, ptr %y_ptr
  %x_plus_y = add i64 %1, %y_val
  ret i64 %x_plus_y
}

; Environment creation:
%closure_env = alloca %ClosureEnv
%y_field = getelementptr %ClosureEnv, ptr %closure_env, i32 0, i32 0
store i64 10, ptr %y_field

; Closure creation:
%closure = alloca %Closure
%fn_ptr_field = getelementptr %Closure, ptr %closure, i32 0, i32 0
%env_ptr_field = getelementptr %Closure, ptr %closure, i32 0, i32 1
store ptr @closure_0, ptr %fn_ptr_field
store ptr %closure_env, ptr %env_ptr_field

Calling Closures

Closures are called by extracting their function pointer and environment:

#![allow(unused)]
fn main() {
// Call closure: closure(42)
let arg_value = i64_type.const_int(42, false);

// Extract function pointer and environment
let fn_ptr_ptr = builder.build_struct_gep(closure_type, closure_alloca, 0, "fn_ptr_ptr").unwrap();
let env_ptr_ptr = builder.build_struct_gep(closure_type, closure_alloca, 1, "env_ptr_ptr").unwrap();

// With opaque pointers both fields load as plain `ptr`, so the shared `ptr_type` suffices
let fn_ptr = builder.build_load(ptr_type, fn_ptr_ptr, "fn_ptr").unwrap().into_pointer_value();
let env_ptr = builder.build_load(ptr_type, env_ptr_ptr, "env_ptr").unwrap().into_pointer_value();

// Call with environment and arguments
let call_args = vec![env_ptr.into(), arg_value.into()];
let call_result = builder.build_indirect_call(closure_fn_type, fn_ptr, &call_args, "closure_call").unwrap();
}

Generated LLVM IR:

%fn_ptr_ptr = getelementptr %Closure, ptr %closure, i32 0, i32 0
%env_ptr_ptr = getelementptr %Closure, ptr %closure, i32 0, i32 1
%fn_ptr = load ptr, ptr %fn_ptr_ptr
%env_ptr = load ptr, ptr %env_ptr_ptr
%closure_call = call i64 %fn_ptr(ptr %env_ptr, i64 42)
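
The whole scheme (lifted function, environment struct, and a pair of pointers as the closure value) can be expressed directly in plain Rust, which may help when reasoning about what the IR does. The names below mirror the IR but are otherwise hypothetical, and the sketch is only one possible rendering of closure conversion.

```rust
/// Captured environment: one field per captured variable (here only `y`).
struct ClosureEnv {
    y: i64,
}

/// The lifted closure body: the environment arrives as an explicit first parameter.
fn closure_0(env: &ClosureEnv, x: i64) -> i64 {
    x + env.y
}

/// Closure value: a function pointer paired with a pointer to its environment.
struct Closure<'a> {
    func: fn(&ClosureEnv, i64) -> i64,
    env: &'a ClosureEnv,
}

/// Indirect call: read both fields, then pass the environment along with the argument.
fn call_closure(closure: &Closure, arg: i64) -> i64 {
    (closure.func)(closure.env, arg)
}
```

Building `Closure { func: closure_0, env: &ClosureEnv { y: 10 } }` and calling it with 42 produces 52, matching the `x + y` body with `y = 10`. The borrow in `env` also makes a lifetime concern visible: a stack-allocated environment must not be outlived by the closure that points to it.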

Method Calls and Object-Oriented Patterns

Why method calls need special handling: Y Lang method calls pass the receiver as an explicit self parameter. Where polymorphism is involved, they may additionally need dynamic dispatch, for example through virtual function tables.

Simple Method Call

Method calls pass the receiver as the first parameter:

#![allow(unused)]
fn main() {
// Y Lang: point.distance_from_origin()
// Where point is a struct with x, y fields

// Method function: fn distance_from_origin(self: &Point) -> f64
let point_type = context.struct_type(&[i64_type.into(), i64_type.into()], false);
let f64_type = context.f64_type();

let method_param_types = vec![
    BasicMetadataTypeEnum::PointerType(point_type.ptr_type(Default::default())), // &self
];
let method_fn_type = f64_type.fn_type(&method_param_types, false);
let method_function = module.add_function("Point_distance_from_origin", method_fn_type, None);

// Implement method body
let method_entry = context.append_basic_block(method_function, "entry");
builder.position_at_end(method_entry);

let self_param = method_function.get_nth_param(0).unwrap().into_pointer_value();

// Load x and y fields
let x_ptr = builder.build_struct_gep(point_type, self_param, 0, "x_ptr").unwrap();
let y_ptr = builder.build_struct_gep(point_type, self_param, 1, "y_ptr").unwrap();

let x_val = builder.build_load(i64_type, x_ptr, "x").unwrap().into_int_value();
let y_val = builder.build_load(i64_type, y_ptr, "y").unwrap().into_int_value();

// Convert to float for calculation
let x_float = builder.build_signed_int_to_float(x_val, f64_type, "x_float").unwrap();
let y_float = builder.build_signed_int_to_float(y_val, f64_type, "y_float").unwrap();

// Calculate sqrt(x^2 + y^2)
let x_squared = builder.build_float_mul(x_float, x_float, "x_squared").unwrap();
let y_squared = builder.build_float_mul(y_float, y_float, "y_squared").unwrap();
let sum_squares = builder.build_float_add(x_squared, y_squared, "sum_squares").unwrap();

// Call sqrt intrinsic
let sqrt_intrinsic = module.get_function("llvm.sqrt.f64").unwrap_or_else(|| {
    let sqrt_fn_type = f64_type.fn_type(&[BasicMetadataTypeEnum::FloatType(f64_type)], false);
    module.add_function("llvm.sqrt.f64", sqrt_fn_type, None)
});

let sqrt_result = builder.build_call(sqrt_intrinsic, &[sum_squares.into()], "distance").unwrap()
    .try_as_basic_value().left().unwrap();

builder.build_return(Some(&sqrt_result)).unwrap();

// Method call: point.distance_from_origin()
// (emitted in the caller, so position the builder in the caller's block first)
let point_alloca = builder.build_alloca(point_type, "point").unwrap();
// ... initialize point ...

let method_result = builder.build_call(method_function, &[point_alloca.into()], "method_call").unwrap();
}

Generated LLVM IR:

define double @Point_distance_from_origin(ptr %0) {
entry:
  %x_ptr = getelementptr { i64, i64 }, ptr %0, i32 0, i32 0
  %y_ptr = getelementptr { i64, i64 }, ptr %0, i32 0, i32 1
  %x = load i64, ptr %x_ptr
  %y = load i64, ptr %y_ptr
  %x_float = sitofp i64 %x to double
  %y_float = sitofp i64 %y to double
  %x_squared = fmul double %x_float, %x_float
  %y_squared = fmul double %y_float, %y_float
  %sum_squares = fadd double %x_squared, %y_squared
  %distance = call double @llvm.sqrt.f64(double %sum_squares)
  ret double %distance
}

; Method call:
%method_call = call double @Point_distance_from_origin(ptr %point)
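
Seen from the source level, this lowering turns a method into a free function whose first parameter is the receiver. The plain-Rust sketch below shows the same shape; the struct and function names are assumptions chosen to mirror the IR, and the comments map each step to the corresponding instruction.

```rust
struct Point {
    x: i64,
    y: i64,
}

/// `point.distance_from_origin()` lowered to a free function whose first
/// parameter is the receiver, mirroring `@Point_distance_from_origin(ptr %0)`.
fn point_distance_from_origin(this: &Point) -> f64 {
    let x = this.x as f64; // sitofp
    let y = this.y as f64; // sitofp
    (x * x + y * y).sqrt() // fmul, fmul, fadd, llvm.sqrt.f64
}
```

For the point (3, 4), `point_distance_from_origin(&Point { x: 3, y: 4 })` returns 5.0.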

Method Chaining

Method chaining requires careful handling of return values:

#![allow(unused)]
fn main() {
// Y Lang: builder.add(1).multiply(2).build()
// Each method returns self or a new value. In this sketch the receiver is
// passed by pointer and mutated in place, so the lowered function returns void
// and every call in the chain reuses the same pointer.

// Builder struct with a single field
let builder_type = context.struct_type(&[i64_type.into()], false); // { value: i64 }

// add method: fn add(mut self, n: i64) -> Self, lowered as (ptr, i64) -> void
let add_method_params = vec![
    BasicMetadataTypeEnum::PointerType(context.ptr_type(Default::default())), // pointer to self
    BasicMetadataTypeEnum::IntType(i64_type), // n
];
let add_method_type = context.void_type().fn_type(&add_method_params, false);
let add_method = module.add_function("Builder_add", add_method_type, None);

let add_entry = context.append_basic_block(add_method, "entry");
builder.position_at_end(add_entry);

let self_param = add_method.get_nth_param(0).unwrap().into_pointer_value();
let n_param = add_method.get_nth_param(1).unwrap().into_int_value();

// Modify self.value += n
let value_ptr = builder.build_struct_gep(builder_type, self_param, 0, "value_ptr").unwrap();
let current_value = builder.build_load(i64_type, value_ptr, "current").unwrap().into_int_value();
let new_value = builder.build_int_add(current_value, n_param, "new_value").unwrap();
builder.build_store(value_ptr, new_value).unwrap();

builder.build_return(None).unwrap(); // Void return, self is modified in place

// Chained call: builder.add(1).multiply(2)
// (emitted in the caller: position the builder in the caller's block first)
let builder_alloca = builder.build_alloca(builder_type, "builder").unwrap();

// First call: builder.add(1)
builder.build_call(add_method, &[builder_alloca.into(), i64_type.const_int(1, false).into()], "add_call").unwrap();

// Second call on same object: .multiply(2)
// multiply_method implementation would be similar...
}
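
The in-place strategy can be modelled in plain Rust to see why the chain works: every method receives the same mutable receiver and hands it on. This is a minimal sketch, assuming multiply and build behave as their names suggest; the source only shows add.

```rust
/// Model of the Y Lang builder struct: { value: i64 }.
struct Builder {
    value: i64,
}

impl Builder {
    /// Mutates the receiver and hands it back so the next call can reuse it.
    fn add(&mut self, n: i64) -> &mut Self {
        self.value += n;
        self
    }

    /// Assumed counterpart to add; not shown in the lowering above.
    fn multiply(&mut self, n: i64) -> &mut Self {
        self.value *= n;
        self
    }

    /// Ends the chain by reading the accumulated value.
    fn build(&self) -> i64 {
        self.value
    }
}

/// Runs the chain from the Y Lang comment: builder.add(1).multiply(2).build()
fn chain_from(start: i64) -> i64 {
    let mut builder = Builder { value: start };
    builder.add(1).multiply(2).build()
}
```

In the LLVM lowering the returned reference is implicit: the caller already holds the pointer, so it passes the same alloca to each call in turn.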

Pattern Matching

Why pattern matching is complex: Pattern matching requires decision trees, value extraction, and exhaustiveness checking while maintaining efficient branching.

Simple Pattern Matching

Y Lang pattern matching on enums and values:

#![allow(unused)]
fn main() {
// Y Lang enum: enum Option<T> { None, Some(T) }
// Pattern match: match option { None => 0, Some(x) => x }

// Represent Option<i64> as tagged union
let option_type = context.struct_type(&[
    context.i8_type().into(),  // tag: 0 = None, 1 = Some
    i64_type.into(),           // value (only valid for Some)
], false);

// Pattern matching function
let match_param_types = vec![BasicMetadataTypeEnum::StructType(option_type)];
let match_fn_type = i64_type.fn_type(&match_param_types, false);
let match_function = module.add_function("match_option", match_fn_type, None);

let entry_block = context.append_basic_block(match_function, "entry");
let none_block = context.append_basic_block(match_function, "match_none");
let some_block = context.append_basic_block(match_function, "match_some");
let merge_block = context.append_basic_block(match_function, "merge");

builder.position_at_end(entry_block);

let option_param = match_function.get_nth_param(0).unwrap().into_struct_value();

// Extract tag field
let tag_value = builder.build_extract_value(option_param, 0, "tag").unwrap().into_int_value();

// Branch on the tag. With only two variants a single comparison is enough;
// an enum with more variants would use builder.build_switch instead.
let zero_tag = context.i8_type().const_int(0, false);

let is_none = builder.build_int_compare(IntPredicate::EQ, tag_value, zero_tag, "is_none").unwrap();
builder.build_conditional_branch(is_none, none_block, some_block).unwrap();

// None case: return 0
builder.position_at_end(none_block);
let none_result = i64_type.const_int(0, false);
builder.build_unconditional_branch(merge_block).unwrap();

// Some case: extract and return value
builder.position_at_end(some_block);
let some_value = builder.build_extract_value(option_param, 1, "some_value").unwrap().into_int_value();
builder.build_unconditional_branch(merge_block).unwrap();

// Merge results
builder.position_at_end(merge_block);
let result_phi = builder.build_phi(i64_type, "match_result").unwrap();
result_phi.add_incoming(&[
    (&none_result, none_block),
    (&some_value, some_block),
]);

builder.build_return(Some(&result_phi.as_basic_value())).unwrap();
}

Generated LLVM IR:

define i64 @match_option({ i8, i64 } %0) {
entry:
  %tag = extractvalue { i8, i64 } %0, 0
  %is_none = icmp eq i8 %tag, 0
  br i1 %is_none, label %match_none, label %match_some

match_none:
  br label %merge

match_some:
  %some_value = extractvalue { i8, i64 } %0, 1
  br label %merge

merge:
  %match_result = phi i64 [ 0, %match_none ], [ %some_value, %match_some ]
  ret i64 %match_result
}
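
The tagged-union representation can be checked against a plain Rust model that follows the same control flow. This is a sketch only; the OptionI64 struct and the tag constants are illustrative names that mirror the `{ i8, i64 }` layout above.

```rust
/// Mirrors the `{ i8, i64 }` layout: a tag plus a payload slot.
#[derive(Debug, Clone, Copy)]
struct OptionI64 {
    tag: u8,    // 0 = None, 1 = Some
    value: i64, // only meaningful when tag == TAG_SOME
}

const TAG_NONE: u8 = 0;
const TAG_SOME: u8 = 1;

fn none() -> OptionI64 {
    OptionI64 { tag: TAG_NONE, value: 0 }
}

fn some(value: i64) -> OptionI64 {
    OptionI64 { tag: TAG_SOME, value }
}

/// Same shape as the IR: compare the tag, pick a value per branch, merge.
fn match_option(option: OptionI64) -> i64 {
    if option.tag == TAG_NONE {
        0 // match_none block
    } else {
        option.value // match_some block
    }
}
```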

Complex Pattern Matching with Guards

Pattern matching with additional conditions:

#![allow(unused)]
fn main() {
// Y Lang: match (x, y) { (0, _) => "zero x", (_, 0) => "zero y", (a, b) if a == b => "equal", _ => "other" }

// This requires multiple decision points and guard evaluation
let tuple_type = context.struct_type(&[i64_type.into(), i64_type.into()], false);
let str_type = context.ptr_type(Default::default()); // String representation

let complex_match_type = str_type.fn_type(&[BasicMetadataTypeEnum::StructType(tuple_type)], false);
let complex_match = module.add_function("complex_match", complex_match_type, None);

let entry = context.append_basic_block(complex_match, "entry");
let check_x_zero = context.append_basic_block(complex_match, "check_x_zero");
let check_y_zero = context.append_basic_block(complex_match, "check_y_zero");
let check_equal = context.append_basic_block(complex_match, "check_equal");
let case_x_zero = context.append_basic_block(complex_match, "case_x_zero");
let case_y_zero = context.append_basic_block(complex_match, "case_y_zero");
let case_equal = context.append_basic_block(complex_match, "case_equal");
let case_other = context.append_basic_block(complex_match, "case_other");
let merge = context.append_basic_block(complex_match, "merge");

builder.position_at_end(entry);
let tuple_param = complex_match.get_nth_param(0).unwrap().into_struct_value();

// Extract tuple elements
let x = builder.build_extract_value(tuple_param, 0, "x").unwrap().into_int_value();
let y = builder.build_extract_value(tuple_param, 1, "y").unwrap().into_int_value();

builder.build_unconditional_branch(check_x_zero).unwrap();

// Check if x == 0
builder.position_at_end(check_x_zero);
let zero = i64_type.const_int(0, false);
let x_is_zero = builder.build_int_compare(IntPredicate::EQ, x, zero, "x_is_zero").unwrap();
builder.build_conditional_branch(x_is_zero, case_x_zero, check_y_zero).unwrap();

// Check if y == 0 (and x != 0)
builder.position_at_end(check_y_zero);
let y_is_zero = builder.build_int_compare(IntPredicate::EQ, y, zero, "y_is_zero").unwrap();
builder.build_conditional_branch(y_is_zero, case_y_zero, check_equal).unwrap();

// Check if x == y (guard condition)
builder.position_at_end(check_equal);
let x_equals_y = builder.build_int_compare(IntPredicate::EQ, x, y, "x_equals_y").unwrap();
builder.build_conditional_branch(x_equals_y, case_equal, case_other).unwrap();

// Case implementations would create string constants and branch to merge...
// Each case creates its result and branches to merge block with PHI node
}
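
The chain of checks above should behave exactly like the source-level match. A small Rust model makes this testable: one function uses a native match with a guard, the other spells out the decision chain in the order of the basic blocks. Both function names are illustrative.

```rust
/// Source-level semantics, written with a native match and guard.
fn complex_match(x: i64, y: i64) -> &'static str {
    match (x, y) {
        (0, _) => "zero x",
        (_, 0) => "zero y",
        (a, b) if a == b => "equal",
        _ => "other",
    }
}

/// The same logic as a chain of checks, in the order of the basic blocks:
/// check_x_zero, check_y_zero, check_equal, then case_other.
fn complex_match_lowered(x: i64, y: i64) -> &'static str {
    if x == 0 {
        return "zero x";
    }
    if y == 0 {
        return "zero y";
    }
    if x == y {
        return "equal";
    }
    "other"
}
```

Note that the order of the checks matters: (0, 0) matches the first arm, so it must yield "zero x" rather than "zero y" or "equal".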

Advanced Optimization Patterns

Tail Call Optimization

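Marking a recursive call as a tail call lets LLVM reuse the caller's stack frame for recursive lambdas and functions:

// Mark the recursive call as eligible for tail call optimization
let recursive_call = builder.build_call(function, &args, "tail_call").unwrap();
recursive_call.set_tail_call(true);

// The marker alone guarantees nothing: LLVM applies the optimization only when
// the call is in tail position, i.e. its result is returned immediately afterwards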
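
To illustrate what the optimization achieves, here is a hedged Rust sketch of a tail-recursive function next to the loop it is equivalent to. The function names are made up for this example; Rust itself does not guarantee tail call elimination, so the loop form stands in for the optimized result.

```rust
/// Tail-recursive form: the recursive call is the last thing the function does.
fn sum_to(n: u64, acc: u64) -> u64 {
    if n == 0 {
        acc
    } else {
        sum_to(n - 1, acc + n) // tail position: result is returned unchanged
    }
}

/// What tail call elimination effectively produces: the call becomes a jump
/// back to the top with updated arguments, so the stack does not grow.
fn sum_to_loop(mut n: u64, mut acc: u64) -> u64 {
    while n != 0 {
        acc += n;
        n -= 1;
    }
    acc
}
```

Both forms compute the same value, but only the loop form is safe for very deep recursion when the optimization is not applied.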

Inline Assembly for Performance Critical Code

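When Y Lang needs low-level operations that LLVM IR cannot express directly:

// Inline assembly for special operations
let asm_type = context.void_type().fn_type(&[i64_type.into()], false);
let inline_asm = context.create_inline_asm(
    asm_type,
    "nop".to_string(),
    "r".to_string(),
    true,  // has side effects
    false, // is align stack
    None,  // dialect
    false, // can throw (this parameter exists when building against LLVM 13 or newer)
);

// create_inline_asm returns a pointer value, so with opaque pointers (LLVM 15+)
// it is invoked through build_indirect_call together with its function type
builder.build_indirect_call(asm_type, inline_asm, &[i64_type.const_int(42, false).into()], "asm_call").unwrap();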

Function Specialization

Creating specialized versions of generic functions:

// Template: fn map<T, U>(arr: [T], f: T -> U) -> [U]
// Specialized: fn map_i64_to_f64(arr: [i64], f: i64 -> f64) -> [f64]

fn specialize_function<'ctx>(
    context: &'ctx Context,
    module: &Module<'ctx>,
    generic_name: &str,
    type_args: &[BasicTypeEnum<'ctx>],
) -> FunctionValue<'ctx> {
    // Generate specialized function name
    // (mangle_types is a helper the code generator has to provide)
    let specialized_name = format!("{}_{}", generic_name, mangle_types(type_args));

    // Create specialized function with concrete types
    // Implementation depends on the specific generic function

    // Return specialized function
    module.get_function(&specialized_name).unwrap()
}
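
The mangle_types helper is not defined in this section. One possible shape, shown here as a hypothetical sketch over a small stand-in enum rather than Inkwell's BasicTypeEnum, simply joins the type names with underscores:

```rust
/// Stand-in for the concrete type arguments of a generic function.
/// (Hypothetical: the real code generator would inspect LLVM or Y Lang types.)
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    I64,
    F64,
    Bool,
}

/// Joins the type names with underscores, e.g. [I64, F64] -> "i64_f64".
fn mangle_types(type_args: &[Ty]) -> String {
    type_args
        .iter()
        .map(|ty| match ty {
            Ty::I64 => "i64",
            Ty::F64 => "f64",
            Ty::Bool => "bool",
        })
        .collect::<Vec<_>>()
        .join("_")
}

/// Builds the symbol name the same way specialize_function does.
fn specialized_name(generic_name: &str, type_args: &[Ty]) -> String {
    format!("{}_{}", generic_name, mangle_types(type_args))
}
```

Under this scheme map specialized for i64 and f64 becomes map_i64_f64. Any scheme works as long as distinct type argument lists never produce the same name.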

Error Handling in Advanced Constructs

Safe Pattern Matching

Ensure exhaustiveness and prevent runtime errors:

#![allow(unused)]
fn main() {
fn validate_pattern_match(
    patterns: &[Pattern],
    input_type: &Type
) -> Result<(), String> {
    // Check pattern exhaustiveness
    if !is_exhaustive(patterns, input_type) {
        return Err("Pattern match is not exhaustive".to_string());
    }

    // Validate pattern types
    for pattern in patterns {
        if !pattern.matches_type(input_type) {
            return Err(format!("Pattern {:?} doesn't match type {:?}", pattern, input_type));
        }
    }

    Ok(())
}
}
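
The is_exhaustive check is left abstract above. For a two-variant enum such as Option it can be sketched as follows; the Pattern enum here is a simplified, hypothetical stand-in that ignores nested patterns and guards.

```rust
/// Simplified patterns over an Option-like enum (hypothetical stand-in).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Pattern {
    Wildcard,
    NoneVariant,
    SomeVariant,
}

/// A match is exhaustive if it has a wildcard arm or covers every variant.
fn is_exhaustive(patterns: &[Pattern]) -> bool {
    let mut seen_none = false;
    let mut seen_some = false;
    for pattern in patterns {
        match pattern {
            Pattern::Wildcard => return true,
            Pattern::NoneVariant => seen_none = true,
            Pattern::SomeVariant => seen_some = true,
        }
    }
    seen_none && seen_some
}

fn validate_pattern_match(patterns: &[Pattern]) -> Result<(), String> {
    if is_exhaustive(patterns) {
        Ok(())
    } else {
        Err("Pattern match is not exhaustive".to_string())
    }
}
```

Arms with guards would need extra care in a real implementation, because a guarded arm may not match even when its pattern does.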

Closure Environment Validation

Ensure captured variables are valid:

#![allow(unused)]
fn main() {
fn validate_closure_capture(
    captured_vars: &[Variable],
    closure_scope: &Scope
) -> Result<(), String> {
    for var in captured_vars {
        if !closure_scope.can_capture(var) {
            return Err(format!("Cannot capture variable {:?} in closure", var.name));
        }

        if var.is_moved() {
            return Err(format!("Cannot capture moved variable {:?}", var.name));
        }
    }

    Ok(())
}
}
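
The validation above relies on Variable and Scope types that this section does not define. A minimal runnable sketch, assuming a scope is just a set of visible names and a variable carries a moved flag, could look like this:

```rust
use std::collections::HashSet;

/// Hypothetical variable record: a name plus a "moved" flag.
struct Variable {
    name: String,
    moved: bool,
}

/// Hypothetical scope: the set of names visible where the closure is created.
struct Scope {
    visible: HashSet<String>,
}

impl Scope {
    fn can_capture(&self, var: &Variable) -> bool {
        self.visible.contains(&var.name)
    }
}

fn validate_closure_capture(captured_vars: &[Variable], closure_scope: &Scope) -> Result<(), String> {
    for var in captured_vars {
        if !closure_scope.can_capture(var) {
            return Err(format!("Cannot capture variable {:?} in closure", var.name));
        }
        if var.moved {
            return Err(format!("Cannot capture moved variable {:?}", var.name));
        }
    }
    Ok(())
}

/// Small helpers for building example inputs.
fn example_scope() -> Scope {
    Scope { visible: ["multiplier", "offset"].iter().map(|s| s.to_string()).collect() }
}

fn var(name: &str, moved: bool) -> Variable {
    Variable { name: name.to_string(), moved }
}
```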

These patterns form the basis for implementing Y Lang's advanced constructs in LLVM. The recurring concerns are memory management, type safety, and the optimization of complex control flow.

Comprehensive Examples

This section provides complete, end-to-end examples that demonstrate how to implement complex Y Lang programs using Inkwell, combining multiple concepts from previous sections into working code generation patterns.

Example 1: Complete Function with Local Variables

Y Lang Source:

fn calculate_area(width: i64, height: i64) -> i64 {
    let area = width * height;
    let doubled = area * 2;
    doubled
}

Complete Inkwell Implementation:

#![allow(unused)]
fn main() {
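use inkwell::builder::Builder;
use inkwell::context::Context;
use inkwell::module::Module;
use inkwell::types::BasicMetadataTypeEnum;

// All Inkwell values share the context's lifetime, so it is named explicitly
fn generate_calculate_area<'ctx>(context: &'ctx Context, module: &Module<'ctx>, builder: &Builder<'ctx>) {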
    let i64_type = context.i64_type();

    // 1. Create function signature
    let param_types = vec![
        BasicMetadataTypeEnum::IntType(i64_type),
        BasicMetadataTypeEnum::IntType(i64_type),
    ];
    let fn_type = i64_type.fn_type(&param_types, false);
    let function = module.add_function("calculate_area", fn_type, None);

    // 2. Create entry block
    let entry_block = context.append_basic_block(function, "entry");
    builder.position_at_end(entry_block);

    // 3. Access parameters
    let width = function.get_nth_param(0).unwrap().into_int_value();
    let height = function.get_nth_param(1).unwrap().into_int_value();

    // 4. Allocate local variables
    let area_alloca = builder.build_alloca(i64_type, "area").unwrap();
    let doubled_alloca = builder.build_alloca(i64_type, "doubled").unwrap();

    // 5. Calculate and store area = width * height
    let area_value = builder.build_int_mul(width, height, "area_calc").unwrap();
    builder.build_store(area_alloca, area_value).unwrap();

    // 6. Calculate and store doubled = area * 2
    let area_loaded = builder.build_load(i64_type, area_alloca, "area_val").unwrap().into_int_value();
    let two = i64_type.const_int(2, false);
    let doubled_value = builder.build_int_mul(area_loaded, two, "doubled_calc").unwrap();
    builder.build_store(doubled_alloca, doubled_value).unwrap();

    // 7. Return doubled (last expression)
    let result = builder.build_load(i64_type, doubled_alloca, "result").unwrap();
    builder.build_return(Some(&result)).unwrap();
}
}

Generated LLVM IR:

define i64 @calculate_area(i64 %0, i64 %1) {
entry:
  %area = alloca i64
  %doubled = alloca i64
  %area_calc = mul i64 %0, %1
  store i64 %area_calc, ptr %area
  %area_val = load i64, ptr %area
  %doubled_calc = mul i64 %area_val, 2
  store i64 %doubled_calc, ptr %doubled
  %result = load i64, ptr %doubled
  ret i64 %result
}

Key Implementation Steps:

  1. Define function signature with parameter types
  2. Create entry basic block for function body
  3. Extract parameters using get_nth_param()
  4. Allocate local variables with build_alloca()
  5. Generate computation instructions
  6. Store intermediate results in local variables
  7. Load final result and return it
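
When testing the generated function, a plain Rust reference is useful for comparing results. The sketch below uses wrapping multiplication because the emitted mul instructions carry no overflow flags and therefore wrap on overflow; the Rust function is an illustration only, not part of the compiler.

```rust
/// Reference model of calculate_area with the same overflow behaviour as
/// a plain LLVM `mul i64` (two's complement wrap-around).
fn calculate_area(width: i64, height: i64) -> i64 {
    let area = width.wrapping_mul(height);
    let doubled = area.wrapping_mul(2);
    doubled
}
```

For ordinary inputs such as 3 and 4 the result is 24. The two local variables exist only to mirror the Y Lang source; an optimizer pass such as mem2reg removes the corresponding allocas from the IR.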

Example 2: Conditional Expression with Complex Logic

Y Lang Source:

fn grade_classifier(score: i64) -> i64 {
    if score >= 90 {
        1  // A grade
    } else if score >= 80 {
        2  // B grade
    } else if score >= 70 {
        3  // C grade
    } else {
        4  // F grade
    }
}

Complete Inkwell Implementation:

#![allow(unused)]
fn main() {
fn generate_grade_classifier<'ctx>(context: &'ctx Context, module: &Module<'ctx>, builder: &Builder<'ctx>) {
    let i64_type = context.i64_type();

    // Function signature
    let param_types = vec![BasicMetadataTypeEnum::IntType(i64_type)];
    let fn_type = i64_type.fn_type(&param_types, false);
    let function = module.add_function("grade_classifier", fn_type, None);

    // Create all basic blocks
    let entry_block = context.append_basic_block(function, "entry");
    let check_90_block = context.append_basic_block(function, "check_90");
    let check_80_block = context.append_basic_block(function, "check_80");
    let check_70_block = context.append_basic_block(function, "check_70");
    let grade_a_block = context.append_basic_block(function, "grade_a");
    let grade_b_block = context.append_basic_block(function, "grade_b");
    let grade_c_block = context.append_basic_block(function, "grade_c");
    let grade_f_block = context.append_basic_block(function, "grade_f");
    let merge_block = context.append_basic_block(function, "merge");

    // Entry: get parameter and start checking
    builder.position_at_end(entry_block);
    let score = function.get_nth_param(0).unwrap().into_int_value();
    builder.build_unconditional_branch(check_90_block).unwrap();

    // Check if score >= 90
    builder.position_at_end(check_90_block);
    let ninety = i64_type.const_int(90, false);
    let is_90_or_above = builder.build_int_compare(
        IntPredicate::SGE, score, ninety, "is_90_or_above"
    ).unwrap();
    builder.build_conditional_branch(is_90_or_above, grade_a_block, check_80_block).unwrap();

    // Check if score >= 80
    builder.position_at_end(check_80_block);
    let eighty = i64_type.const_int(80, false);
    let is_80_or_above = builder.build_int_compare(
        IntPredicate::SGE, score, eighty, "is_80_or_above"
    ).unwrap();
    builder.build_conditional_branch(is_80_or_above, grade_b_block, check_70_block).unwrap();

    // Check if score >= 70
    builder.position_at_end(check_70_block);
    let seventy = i64_type.const_int(70, false);
    let is_70_or_above = builder.build_int_compare(
        IntPredicate::SGE, score, seventy, "is_70_or_above"
    ).unwrap();
    builder.build_conditional_branch(is_70_or_above, grade_c_block, grade_f_block).unwrap();

    // Grade outcomes
    builder.position_at_end(grade_a_block);
    let grade_a = i64_type.const_int(1, false);
    builder.build_unconditional_branch(merge_block).unwrap();

    builder.position_at_end(grade_b_block);
    let grade_b = i64_type.const_int(2, false);
    builder.build_unconditional_branch(merge_block).unwrap();

    builder.position_at_end(grade_c_block);
    let grade_c = i64_type.const_int(3, false);
    builder.build_unconditional_branch(merge_block).unwrap();

    builder.position_at_end(grade_f_block);
    let grade_f = i64_type.const_int(4, false);
    builder.build_unconditional_branch(merge_block).unwrap();

    // Merge all paths with PHI
    builder.position_at_end(merge_block);
    let phi = builder.build_phi(i64_type, "grade_result").unwrap();
    phi.add_incoming(&[
        (&grade_a, grade_a_block),
        (&grade_b, grade_b_block),
        (&grade_c, grade_c_block),
        (&grade_f, grade_f_block),
    ]);

    builder.build_return(Some(&phi.as_basic_value())).unwrap();
}
}

Generated LLVM IR:

define i64 @grade_classifier(i64 %0) {
entry:
  br label %check_90

check_90:
  %is_90_or_above = icmp sge i64 %0, 90
  br i1 %is_90_or_above, label %grade_a, label %check_80

check_80:
  %is_80_or_above = icmp sge i64 %0, 80
  br i1 %is_80_or_above, label %grade_b, label %check_70

check_70:
  %is_70_or_above = icmp sge i64 %0, 70
  br i1 %is_70_or_above, label %grade_c, label %grade_f

grade_a:
  br label %merge

grade_b:
  br label %merge

grade_c:
  br label %merge

grade_f:
  br label %merge

merge:
  %grade_result = phi i64 [ 1, %grade_a ], [ 2, %grade_b ], [ 3, %grade_c ], [ 4, %grade_f ]
  ret i64 %grade_result
}
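
The boundaries of the comparison chain are the part most likely to go wrong (for example using a strict instead of a non-strict comparison). A plain Rust reference, given here as an illustrative sketch, makes the expected results at each boundary explicit:

```rust
/// Reference model of grade_classifier; the checks run in the same order as
/// the check_90, check_80 and check_70 blocks and use signed >= (icmp sge).
fn grade_classifier(score: i64) -> i64 {
    if score >= 90 {
        1 // A grade
    } else if score >= 80 {
        2 // B grade
    } else if score >= 70 {
        3 // C grade
    } else {
        4 // F grade
    }
}
```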

Example 3: Struct Field Access and Array Processing

Y Lang Source:

struct Point {
    x: i64,
    y: i64
}

fn process_points(points: [Point; 3]) -> i64 {
    let mut sum = 0;
    let mut i = 0;
    while i < 3 {
        let point = points[i];
        sum = sum + point.x + point.y;
        i = i + 1;
    }
    sum
}

Complete Inkwell Implementation:

#![allow(unused)]
fn main() {
fn generate_point_processing<'ctx>(context: &'ctx Context, module: &Module<'ctx>, builder: &Builder<'ctx>) {
    let i64_type = context.i64_type();

    // 1. Define Point as a named struct type so that it prints as %Point in the IR
    let point_type = context.opaque_struct_type("Point");
    point_type.set_body(&[
        i64_type.into(), // x field
        i64_type.into(), // y field
    ], false);

    // 2. Define array type: [Point; 3]
    let point_array_type = point_type.array_type(3);

    // 3. Function signature: fn process_points(points: [Point; 3]) -> i64
    let param_types = vec![BasicMetadataTypeEnum::ArrayType(point_array_type)];
    let fn_type = i64_type.fn_type(&param_types, false);
    let function = module.add_function("process_points", fn_type, None);

    // 4. Create basic blocks
    let entry_block = context.append_basic_block(function, "entry");
    let loop_header = context.append_basic_block(function, "loop_header");
    let loop_body = context.append_basic_block(function, "loop_body");
    let loop_exit = context.append_basic_block(function, "loop_exit");

    // 5. Entry block: initialize variables
    builder.position_at_end(entry_block);

    // Copy array parameter to local memory
    let points_param = function.get_nth_param(0).unwrap().into_array_value();
    let points_alloca = builder.build_alloca(point_array_type, "points").unwrap();
    builder.build_store(points_alloca, points_param).unwrap();

    // Initialize local variables
    let sum_alloca = builder.build_alloca(i64_type, "sum").unwrap();
    let i_alloca = builder.build_alloca(i64_type, "i").unwrap();

    let zero = i64_type.const_zero();
    builder.build_store(sum_alloca, zero).unwrap();
    builder.build_store(i_alloca, zero).unwrap();

    builder.build_unconditional_branch(loop_header).unwrap();

    // 6. Loop header: check condition i < 3
    builder.position_at_end(loop_header);
    let current_i = builder.build_load(i64_type, i_alloca, "current_i").unwrap().into_int_value();
    let three = i64_type.const_int(3, false);
    let condition = builder.build_int_compare(
        IntPredicate::SLT, current_i, three, "i_lt_3"
    ).unwrap();
    builder.build_conditional_branch(condition, loop_body, loop_exit).unwrap();

    // 7. Loop body: process array element
    builder.position_at_end(loop_body);

    // Access points[i]
    let zero_idx = i64_type.const_zero();
    let current_i_body = builder.build_load(i64_type, i_alloca, "i_for_access").unwrap().into_int_value();

    let point_ptr = unsafe {
        builder.build_gep(
            point_array_type,
            points_alloca,
            &[zero_idx, current_i_body],
            "point_ptr"
        ).unwrap()
    };

    // Load point struct
    let point = builder.build_load(point_type, point_ptr, "point").unwrap().into_struct_value();

    // Extract x and y fields
    let point_x = builder.build_extract_value(point, 0, "point_x").unwrap().into_int_value();
    let point_y = builder.build_extract_value(point, 1, "point_y").unwrap().into_int_value();

    // Update sum: sum = sum + point.x + point.y
    let current_sum = builder.build_load(i64_type, sum_alloca, "current_sum").unwrap().into_int_value();
    let sum_plus_x = builder.build_int_add(current_sum, point_x, "sum_plus_x").unwrap();
    let new_sum = builder.build_int_add(sum_plus_x, point_y, "new_sum").unwrap();
    builder.build_store(sum_alloca, new_sum).unwrap();

    // Update i: i = i + 1
    let one = i64_type.const_int(1, false);
    let i_for_increment = builder.build_load(i64_type, i_alloca, "i_for_inc").unwrap().into_int_value();
    let new_i = builder.build_int_add(i_for_increment, one, "new_i").unwrap();
    builder.build_store(i_alloca, new_i).unwrap();

    builder.build_unconditional_branch(loop_header).unwrap();

    // 8. Loop exit: return sum
    builder.position_at_end(loop_exit);
    let final_sum = builder.build_load(i64_type, sum_alloca, "final_sum").unwrap();
    builder.build_return(Some(&final_sum)).unwrap();
}
}

Generated LLVM IR:

%Point = type { i64, i64 }

define i64 @process_points([3 x %Point] %0) {
entry:
  %points = alloca [3 x %Point]
  store [3 x %Point] %0, ptr %points
  %sum = alloca i64
  %i = alloca i64
  store i64 0, ptr %sum
  store i64 0, ptr %i
  br label %loop_header

loop_header:
  %current_i = load i64, ptr %i
  %i_lt_3 = icmp slt i64 %current_i, 3
  br i1 %i_lt_3, label %loop_body, label %loop_exit

loop_body:
  %i_for_access = load i64, ptr %i
  %point_ptr = getelementptr [3 x %Point], ptr %points, i64 0, i64 %i_for_access
  %point = load %Point, ptr %point_ptr
  %point_x = extractvalue %Point %point, 0
  %point_y = extractvalue %Point %point, 1
  %current_sum = load i64, ptr %sum
  %sum_plus_x = add i64 %current_sum, %point_x
  %new_sum = add i64 %sum_plus_x, %point_y
  store i64 %new_sum, ptr %sum
  %i_for_inc = load i64, ptr %i
  %new_i = add i64 %i_for_inc, 1
  store i64 %new_i, ptr %i
  br label %loop_header

loop_exit:
  %final_sum = load i64, ptr %sum
  ret i64 %final_sum
}
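
A plain Rust model of process_points can serve as a reference when checking the loop structure. It is a sketch with an illustrative Point struct; wrapping addition is used because the emitted add instructions carry no overflow flags.

```rust
/// Mirrors %Point = type { i64, i64 }.
#[derive(Debug, Clone, Copy)]
struct Point {
    x: i64,
    y: i64,
}

/// Same loop shape as the IR: header check, body, increment, exit.
fn process_points(points: [Point; 3]) -> i64 {
    let mut sum: i64 = 0;
    let mut i: usize = 0;
    while i < 3 {
        let point = points[i]; // GEP + load of the whole struct
        sum = sum.wrapping_add(point.x).wrapping_add(point.y);
        i += 1;
    }
    sum
}
```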

Example 4: Higher-Order Function with Closure

Y Lang Source:

fn map_and_sum(arr: [i64; 3], transform: (i64) -> i64) -> i64 {
    let mut sum = 0;
    let mut i = 0;
    while i < 3 {
        let transformed = transform(arr[i]);
        sum = sum + transformed;
        i = i + 1;
    }
    sum
}

fn main() -> i64 {
    let numbers = [1, 2, 3];
    let multiplier = 10;
    let closure = \(x) => x * multiplier;
    map_and_sum(numbers, closure)
}

Complete Inkwell Implementation:

#![allow(unused)]
fn main() {
fn generate_closure_example<'ctx>(context: &'ctx Context, module: &Module<'ctx>, builder: &Builder<'ctx>) {
    let i64_type = context.i64_type();
    let ptr_type = context.ptr_type(Default::default());

    // 1. Define closure environment for captured variables
    //    (named so that it prints as %ClosureEnv in the IR)
    let closure_env_type = context.opaque_struct_type("ClosureEnv");
    closure_env_type.set_body(&[
        i64_type.into(), // captured multiplier
    ], false);

    // 2. Define closure function type: (env*, i64) -> i64
    let closure_fn_param_types = vec![
        BasicMetadataTypeEnum::PointerType(ptr_type),
        BasicMetadataTypeEnum::IntType(i64_type),
    ];
    let closure_fn_type = i64_type.fn_type(&closure_fn_param_types, false);

    // 3. Define closure representation: {fn_ptr, env_ptr}
    //    With opaque pointers both fields share the same pointer type
    let closure_type = context.opaque_struct_type("Closure");
    closure_type.set_body(&[
        ptr_type.into(), // function pointer
        ptr_type.into(), // environment pointer
    ], false);

    // 4. Generate the closure function: |x| x * multiplier
    let closure_function = module.add_function("closure_multiply", closure_fn_type, None);
    let closure_entry = context.append_basic_block(closure_function, "entry");
    builder.position_at_end(closure_entry);

    let env_param = closure_function.get_nth_param(0).unwrap().into_pointer_value();
    let x_param = closure_function.get_nth_param(1).unwrap().into_int_value();

    // Extract multiplier from environment
    let multiplier_ptr = builder.build_struct_gep(closure_env_type, env_param, 0, "multiplier_ptr").unwrap();
    let multiplier = builder.build_load(i64_type, multiplier_ptr, "multiplier").unwrap().into_int_value();

    // Compute x * multiplier
    let result = builder.build_int_mul(x_param, multiplier, "multiply_result").unwrap();
    builder.build_return(Some(&result)).unwrap();

    // 5. Generate map_and_sum function
    let array_type = i64_type.array_type(3);
    let map_sum_param_types = vec![
        BasicMetadataTypeEnum::ArrayType(array_type),
        BasicMetadataTypeEnum::StructType(closure_type),
    ];
    let map_sum_fn_type = i64_type.fn_type(&map_sum_param_types, false);
    let map_sum_function = module.add_function("map_and_sum", map_sum_fn_type, None);

    // Map and sum implementation blocks
    let entry_block = context.append_basic_block(map_sum_function, "entry");
    let loop_header = context.append_basic_block(map_sum_function, "loop_header");
    let loop_body = context.append_basic_block(map_sum_function, "loop_body");
    let loop_exit = context.append_basic_block(map_sum_function, "loop_exit");

    builder.position_at_end(entry_block);

    // Copy parameters to local memory
    let arr_param = map_sum_function.get_nth_param(0).unwrap().into_array_value();
    let closure_param = map_sum_function.get_nth_param(1).unwrap().into_struct_value();

    let arr_alloca = builder.build_alloca(array_type, "arr").unwrap();
    let closure_alloca = builder.build_alloca(closure_type, "closure").unwrap();

    builder.build_store(arr_alloca, arr_param).unwrap();
    builder.build_store(closure_alloca, closure_param).unwrap();

    // Initialize loop variables
    let sum_alloca = builder.build_alloca(i64_type, "sum").unwrap();
    let i_alloca = builder.build_alloca(i64_type, "i").unwrap();

    let zero = i64_type.const_zero();
    builder.build_store(sum_alloca, zero).unwrap();
    builder.build_store(i_alloca, zero).unwrap();

    builder.build_unconditional_branch(loop_header).unwrap();

    // Loop header
    builder.position_at_end(loop_header);
    let current_i = builder.build_load(i64_type, i_alloca, "current_i").unwrap().into_int_value();
    let three = i64_type.const_int(3, false);
    let condition = builder.build_int_compare(IntPredicate::SLT, current_i, three, "i_lt_3").unwrap();
    builder.build_conditional_branch(condition, loop_body, loop_exit).unwrap();

    // Loop body: call closure with array element
    builder.position_at_end(loop_body);

    // Get arr[i]
    let zero_idx = i64_type.const_zero();
    let i_for_access = builder.build_load(i64_type, i_alloca, "i_for_access").unwrap().into_int_value();
    let elem_ptr = unsafe {
        builder.build_gep(array_type, arr_alloca, &[zero_idx, i_for_access], "elem_ptr").unwrap()
    };
    let elem_value = builder.build_load(i64_type, elem_ptr, "elem_value").unwrap().into_int_value();

    // Extract closure function and environment
    let closure_loaded = builder.build_load(closure_type, closure_alloca, "closure_loaded").unwrap().into_struct_value();
    let fn_ptr = builder.build_extract_value(closure_loaded, 0, "fn_ptr").unwrap().into_pointer_value();
    let env_ptr = builder.build_extract_value(closure_loaded, 1, "env_ptr").unwrap().into_pointer_value();

    // Call closure: transform(arr[i])
    let call_args = vec![env_ptr.into(), elem_value.into()];
    let transformed = builder.build_indirect_call(closure_fn_type, fn_ptr, &call_args, "transformed").unwrap()
        .try_as_basic_value().left().unwrap().into_int_value();

    // Update sum
    let current_sum = builder.build_load(i64_type, sum_alloca, "current_sum").unwrap().into_int_value();
    let new_sum = builder.build_int_add(current_sum, transformed, "new_sum").unwrap();
    builder.build_store(sum_alloca, new_sum).unwrap();

    // Update i
    let one = i64_type.const_int(1, false);
    let i_for_inc = builder.build_load(i64_type, i_alloca, "i_for_inc").unwrap().into_int_value();
    let new_i = builder.build_int_add(i_for_inc, one, "new_i").unwrap();
    builder.build_store(i_alloca, new_i).unwrap();

    builder.build_unconditional_branch(loop_header).unwrap();

    // Loop exit
    builder.position_at_end(loop_exit);
    let final_sum = builder.build_load(i64_type, sum_alloca, "final_sum").unwrap();
    builder.build_return(Some(&final_sum)).unwrap();

    // 6. Generate main function
    let main_fn_type = i64_type.fn_type(&[], false);
    let main_function = module.add_function("main", main_fn_type, None);
    let main_entry = context.append_basic_block(main_function, "entry");
    builder.position_at_end(main_entry);

    // Create numbers array: [1, 2, 3]
    let numbers_array = i64_type.const_array(&[
        i64_type.const_int(1, false),
        i64_type.const_int(2, false),
        i64_type.const_int(3, false),
    ]);

    // Create closure environment with multiplier = 10
    let multiplier_value = i64_type.const_int(10, false);
    let env_alloca = builder.build_alloca(closure_env_type, "env").unwrap();
    let multiplier_field = builder.build_struct_gep(closure_env_type, env_alloca, 0, "multiplier_field").unwrap();
    builder.build_store(multiplier_field, multiplier_value).unwrap();

    // Create closure struct
    let closure_struct_alloca = builder.build_alloca(closure_type, "closure_struct").unwrap();
    let fn_ptr_field = builder.build_struct_gep(closure_type, closure_struct_alloca, 0, "fn_ptr_field").unwrap();
    let env_ptr_field = builder.build_struct_gep(closure_type, closure_struct_alloca, 1, "env_ptr_field").unwrap();

    let closure_fn_ptr = closure_function.as_global_value().as_pointer_value();
    builder.build_store(fn_ptr_field, closure_fn_ptr).unwrap();
    builder.build_store(env_ptr_field, env_alloca).unwrap();

    // Call map_and_sum
    let closure_struct_value = builder.build_load(closure_type, closure_struct_alloca, "closure_value").unwrap();
    let call_args = vec![numbers_array.into(), closure_struct_value.into()];
    let result = builder.build_call(map_sum_function, &call_args, "map_sum_result").unwrap()
        .try_as_basic_value().left().unwrap();

    builder.build_return(Some(&result)).unwrap();
}
}

Generated LLVM IR:

%ClosureEnv = type { i64 }
%Closure = type { ptr, ptr }

define i64 @closure_multiply(ptr %0, i64 %1) {
entry:
  %multiplier_ptr = getelementptr %ClosureEnv, ptr %0, i32 0, i32 0
  %multiplier = load i64, ptr %multiplier_ptr
  %multiply_result = mul i64 %1, %multiplier
  ret i64 %multiply_result
}

define i64 @map_and_sum([3 x i64] %0, %Closure %1) {
entry:
  %arr = alloca [3 x i64]
  %closure = alloca %Closure
  store [3 x i64] %0, ptr %arr
  store %Closure %1, ptr %closure
  %sum = alloca i64
  %i = alloca i64
  store i64 0, ptr %sum
  store i64 0, ptr %i
  br label %loop_header

loop_header:
  %current_i = load i64, ptr %i
  %i_lt_3 = icmp slt i64 %current_i, 3
  br i1 %i_lt_3, label %loop_body, label %loop_exit

loop_body:
  %i_for_access = load i64, ptr %i
  %elem_ptr = getelementptr [3 x i64], ptr %arr, i64 0, i64 %i_for_access
  %elem_value = load i64, ptr %elem_ptr
  %closure_loaded = load %Closure, ptr %closure
  %fn_ptr = extractvalue %Closure %closure_loaded, 0
  %env_ptr = extractvalue %Closure %closure_loaded, 1
  %transformed = call i64 %fn_ptr(ptr %env_ptr, i64 %elem_value)
  %current_sum = load i64, ptr %sum
  %new_sum = add i64 %current_sum, %transformed
  store i64 %new_sum, ptr %sum
  %i_for_inc = load i64, ptr %i
  %new_i = add i64 %i_for_inc, 1
  store i64 %new_i, ptr %i
  br label %loop_header

loop_exit:
  %final_sum = load i64, ptr %sum
  ret i64 %final_sum
}

define i64 @main() {
entry:
  %env = alloca %ClosureEnv
  %multiplier_field = getelementptr %ClosureEnv, ptr %env, i32 0, i32 0
  store i64 10, ptr %multiplier_field
  %closure_struct = alloca %Closure
  %fn_ptr_field = getelementptr %Closure, ptr %closure_struct, i32 0, i32 0
  %env_ptr_field = getelementptr %Closure, ptr %closure_struct, i32 0, i32 1
  store ptr @closure_multiply, ptr %fn_ptr_field
  store ptr %env, ptr %env_ptr_field
  %closure_value = load %Closure, ptr %closure_struct
  %map_sum_result = call i64 @map_and_sum([3 x i64] [i64 1, i64 2, i64 3], %Closure %closure_value)
  ret i64 %map_sum_result
}
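
To make the closure layout easier to follow, here is a plain Rust sketch that mirrors the IR above without using Inkwell. The struct and function names are copied from the IR for readability; `run_example` is a hypothetical stand-in for `@main`, and the use of `c_void` for the opaque `ptr` type is an assumption of this sketch, not something the compiler does.

```rust
use std::ffi::c_void;

/// Mirrors `%ClosureEnv = type { i64 }`: the captured variables.
#[repr(C)]
struct ClosureEnv {
    multiplier: i64,
}

/// Mirrors `%Closure = type { ptr, ptr }`: function pointer plus environment pointer.
#[repr(C)]
#[derive(Clone, Copy)]
struct Closure {
    fn_ptr: unsafe fn(*const c_void, i64) -> i64,
    env_ptr: *const c_void,
}

/// Mirrors `@closure_multiply`: the environment arrives as the first argument.
///
/// Safety: `env` must point to a live `ClosureEnv`.
unsafe fn closure_multiply(env: *const c_void, x: i64) -> i64 {
    let env = unsafe { &*(env as *const ClosureEnv) };
    x * env.multiplier
}

/// Mirrors `@map_and_sum`: call the closure on each element and add up the results.
fn map_and_sum(arr: [i64; 3], closure: Closure) -> i64 {
    let mut sum = 0;
    let mut i = 0;
    while i < 3 {
        // Indirect call through the function pointer, passing the environment pointer.
        sum += unsafe { (closure.fn_ptr)(closure.env_ptr, arr[i]) };
        i += 1;
    }
    sum
}

/// Hypothetical stand-in for `@main` in the IR.
fn run_example() -> i64 {
    // The environment lives on the stack, which is fine here because
    // the closure does not outlive this function.
    let env = ClosureEnv { multiplier: 10 };
    let closure = Closure {
        fn_ptr: closure_multiply,
        env_ptr: &env as *const ClosureEnv as *const c_void,
    };
    map_and_sum([1, 2, 3], closure)
}

fn main() {
    println!("{}", run_example()); // prints 60
}
```

The result is 1 * 10 + 2 * 10 + 3 * 10 = 60, which is also the value `@main` returns in the IR.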

Implementation Patterns Summary

Pattern 1: Function Structure

  1. Define signature - Parameter types and return type
  2. Create entry block - Starting point for function body
  3. Handle parameters - Read the parameter values and, if they can be mutated, copy them into stack slots (alloca)
  4. Generate body - Implement function logic with LLVM instructions
  5. Handle return - Return value or void
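
The five steps can be sketched by writing out the IR text they would produce for a small function. This sketch deliberately does not use Inkwell; `emit_add_function` and the `@add` function it describes are hypothetical names chosen for illustration.

```rust
/// Hypothetical helper (plain string building, not Inkwell): returns the
/// textual IR for `add(a, b)`, ordered by the five steps of Pattern 1.
fn emit_add_function() -> String {
    let lines = [
        // 1. Define signature: two i64 parameters, i64 return type.
        "define i64 @add(i64 %0, i64 %1) {",
        // 2. Create entry block.
        "entry:",
        // 3. Handle parameters: copy them into stack slots so they could be mutated.
        "  %a = alloca i64",
        "  store i64 %0, ptr %a",
        "  %b = alloca i64",
        "  store i64 %1, ptr %b",
        // 4. Generate body: load the values and add them.
        "  %a_val = load i64, ptr %a",
        "  %b_val = load i64, ptr %b",
        "  %sum = add i64 %a_val, %b_val",
        // 5. Handle return.
        "  ret i64 %sum",
        "}",
    ];
    lines.join("\n")
}

fn main() {
    println!("{}", emit_add_function());
}
```

If the parameters are never reassigned, step 3 can be skipped and `%0` and `%1` can be used directly, as `@closure_multiply` does with `%1`.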

Pattern 2: Control Flow

  1. Create all blocks first - Plan the control flow graph
  2. Position builder - Move between blocks systematically
  3. Generate conditions - Use comparison instructions
  4. Branch appropriately - Conditional or unconditional branches
  5. Merge with PHI - Combine values from different paths (or keep them in stack slots, as map_and_sum does for %sum and %i)
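
As a rough illustration of this ordering, the sketch below models a builder that creates all blocks up front and is then positioned block by block. `FunctionSketch` is a hypothetical, string-based stand-in written for this example; it is not Inkwell's `Builder`, and `@max` is an assumed example function.

```rust
/// Hypothetical mini-builder: each block is a label plus textual instructions.
struct FunctionSketch {
    header: String,
    blocks: Vec<(String, Vec<String>)>,
    current: usize,
}

impl FunctionSketch {
    fn new(header: &str) -> Self {
        FunctionSketch { header: header.to_string(), blocks: Vec::new(), current: 0 }
    }

    /// Appends an empty block and returns its index.
    fn append_block(&mut self, label: &str) -> usize {
        self.blocks.push((label.to_string(), Vec::new()));
        self.blocks.len() - 1
    }

    /// Selects the block that subsequent `emit` calls write into.
    fn position_at_end(&mut self, block: usize) {
        self.current = block;
    }

    fn emit(&mut self, instruction: &str) {
        self.blocks[self.current].1.push(instruction.to_string());
    }

    fn render(&self) -> String {
        let mut out = format!("{} {{\n", self.header);
        for (index, (label, instructions)) in self.blocks.iter().enumerate() {
            if index > 0 {
                out.push('\n');
            }
            out.push_str(&format!("{label}:\n"));
            for instruction in instructions {
                out.push_str(&format!("  {instruction}\n"));
            }
        }
        out.push_str("}\n");
        out
    }
}

fn emit_max() -> String {
    let mut f = FunctionSketch::new("define i64 @max(i64 %0, i64 %1)");
    // 1. Create all blocks first.
    let entry = f.append_block("entry");
    let then_block = f.append_block("then");
    let else_block = f.append_block("else");
    let merge = f.append_block("merge");
    // 2-4. Position the builder, generate the condition, branch.
    f.position_at_end(entry);
    f.emit("%a_gt_b = icmp sgt i64 %0, %1");
    f.emit("br i1 %a_gt_b, label %then, label %else");
    f.position_at_end(then_block);
    f.emit("br label %merge");
    f.position_at_end(else_block);
    f.emit("br label %merge");
    // 5. Merge with PHI: pick the value that belongs to the incoming path.
    f.position_at_end(merge);
    f.emit("%result = phi i64 [ %0, %then ], [ %1, %else ]");
    f.emit("ret i64 %result");
    f.render()
}

fn main() {
    print!("{}", emit_max());
}
```

Creating the blocks before filling them means every branch target already exists when the branch instruction is emitted.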

Pattern 3: Data Structures

  1. Define types - Struct, array, or composite types
  2. Allocate storage - Stack allocation for local data
  3. Access elements - GEP for arrays, struct_gep for structs
  4. Load/store values - Move data between memory and registers
  5. Extract/insert - Work with composite values directly
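
Steps 3 and 4 are separate because a GEP only computes an address; nothing is read until the load. The plain Rust sketch below illustrates that address arithmetic, assuming a target where `i64` is 8 bytes and naturally aligned. `Pair`, `array_element_offset`, and `second_field_offset` are hypothetical names for this example.

```rust
use std::mem::size_of;

/// Laid out like the LLVM struct type `{ i64, i64 }` (hypothetical example type).
#[repr(C)]
struct Pair {
    first: i64,
    second: i64,
}

/// Byte offset that a GEP into `[N x i64]` with indices `(0, index)` resolves to.
fn array_element_offset(index: usize) -> usize {
    index * size_of::<i64>()
}

/// Byte offset of `second`, which is what a struct GEP with field index 1 resolves to.
fn second_field_offset(pair: &Pair) -> usize {
    let base = pair as *const Pair as usize;
    let field = &pair.second as *const i64 as usize;
    field - base
}

fn main() {
    let numbers: [i64; 3] = [1, 2, 3];
    let base = numbers.as_ptr() as usize;
    let third = &numbers[2] as *const i64 as usize;
    // Pure address computation; no element has been loaded at this point.
    assert_eq!(third - base, array_element_offset(2));

    let pair = Pair { first: 7, second: 9 };
    println!("array element 2 is at byte offset {}", array_element_offset(2));
    println!("field 1 is at byte offset {}", second_field_offset(&pair));
    // The load is a separate step, like `load i64, ptr %elem_ptr` in the IR.
    println!("values: {} and {}", pair.first, pair.second);
}
```

Extract/insert in step 5 differs from this: `extractvalue` works on a struct value that is already in a register, so it needs no address at all.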

Pattern 4: Complex Features

  1. Plan data representation - How to represent language features in LLVM
  2. Create helper structures - Environment structs for closures, etc.
  3. Generate helper functions - Functions that implement language semantics
  4. Coordinate multiple components - Tie together all the pieces
  5. Optimize for clarity - Keep generated code readable and efficient

These examples show how the individual concepts from the previous sections combine into complete Y Lang programs that are lowered to LLVM IR through Inkwell.