Renyi Yang

Useless Machine - Arduino

2026-03-21T00:00:00+00:00

A Tkinter-based Arduino control hub that combines two activities:

a classic useless machine that reacts to physical switches with a servo and stepper carriage
a DC motor position control interface with P / PI control and live plotting through Banyan

The main application starts with a hardware keypad login, then routes the user into a menu where each activity can be launched, stopped, and restarted independently.

Project Overview

The repository is organized around a few focused modules:

global_interface.py is the main GUI application. It handles keypad authentication, the activity menu, runtime switching, and the live status dashboard.
keypad.py contains the keypad scanner abstraction used by the login flow.
machine.py drives the useless machine hardware: switches, stepper, and servo.
motor.py contains the DC motor, potentiometer, and fixed setpoint helpers.
motor_control.py implements the discrete P / PI position controller.
motor_plot.py publishes live motor telemetry over Banyan.
motor_plot_subscriber.py displays the live response plot in a separate process.
archive/ keeps older experiments and previous interfaces for reference.

Features

Keypad login

Callback-based key detection to avoid noisy polling on every edge.
Timeout handling, input clearing with *, and shutdown request with D.
A simple digit-code flow before the main menu unlocks.

Useless machine mode

Stepper homing on startup for a stable reference point.
Switch-driven behavior that moves the carriage to the active switch position.
Servo deploy / retract sequence with automatic return-to-home after inactivity.
Live telemetry in the GUI for switch states, home state, position, and action text.

DC motor mode

Manual or PC setpoint source selection.
P and PI controller modes with tunable Kp and Ki.
Anti-overreaction limits with bounded PWM output.
Wrapped ADC error computation for circular position behavior.
Live response publishing to a separate subscriber window.
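
As a rough illustration of the wrapped error and bounded output listed above, here is a minimal sketch. It assumes a 10-bit ADC (readings 0 to 1023), and the names `wrapped_error`, `PIController`, and `max_pwm` are hypothetical rather than taken from motor_control.py. The integrator freeze while saturated is a common anti-windup choice added here for illustration; the source does not say the project uses it.

```python
ADC_RANGE = 1024  # assumed 10-bit ADC: readings 0..1023


def wrapped_error(setpoint: int, measured: int, adc_range: int = ADC_RANGE) -> int:
    """Signed shortest-path error on a circular scale, in [-range/2, range/2)."""
    half = adc_range // 2
    return (setpoint - measured + half) % adc_range - half


class PIController:
    """Discrete PI controller with a bounded PWM output (hypothetical sketch)."""

    def __init__(self, kp: float, ki: float, dt: float, max_pwm: int = 150):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.max_pwm = max_pwm
        self.integral = 0.0

    def update(self, setpoint: int, measured: int) -> int:
        error = wrapped_error(setpoint, measured)
        candidate = self.integral + error * self.dt
        raw = self.kp * error + self.ki * candidate
        # Clamp the command so a large error cannot drive the motor at full power
        pwm = max(-self.max_pwm, min(self.max_pwm, raw))
        if pwm == raw:
            # Assumption: only integrate while unsaturated (simple anti-windup)
            self.integral = candidate
        return int(round(pwm))
```

With this definition, `wrapped_error(10, 1020)` is 14 rather than -1010, so the motor takes the short way across the 0 / 1023 boundary. The sign of the returned PWM value would select the drive direction.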

Motor control graph examples

Media

Demo videos

Highlights

Keypad

Callback triggered: only sets a flag, so noisy edges never run heavy work inside the callback
Main polling loop: checks the flag, executes the action, then resets the flag
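
The flag pattern described above could look roughly like the following. This is a sketch only: `KeyFlag` and `poll_once` are hypothetical names, not the actual API of keypad.py, and the handling of `*` and `D` simply mirrors the feature list.

```python
import threading


class KeyFlag:
    """Holds the most recent key press reported by a hardware callback."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = None

    def on_key(self, key: str) -> None:
        """Callback side: record the key and return immediately."""
        with self._lock:
            if self._pending is None:  # ignore repeats until the loop consumes it
                self._pending = key

    def take(self):
        """Polling side: return the pending key (or None) and reset the flag."""
        with self._lock:
            key, self._pending = self._pending, None
            return key


def poll_once(flag: KeyFlag, typed: list) -> str:
    """One iteration of the main loop: check flag, execute action, reset flag."""
    key = flag.take()
    if key is None:
        return "idle"
    if key == "*":
        typed.clear()       # clear the current input
        return "cleared"
    if key == "D":
        return "shutdown"   # caller decides how to shut down
    typed.append(key)
    return "key"
```

Because the callback does nothing except store one value, a bouncing contact that fires it several times still produces a single action in the loop.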

DC Motor

The live plotting is decoupled from the main GUI to allow continuous updates without freezing the interface, using Banyan for inter-process communication.

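Wrap-around handling for the 0 and 1023 ADC corner cases
Capped maximum PWM to prevent overreaction
Banyan messages: setpoint, measurement, timestamp
Kp and Ki tuning (the integral term erases the remaining gap between setpoint and measurement)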

Useless Machine

The useless machine logic is event-driven, responding to switch state changes and automatically returning to home after inactivity.

Callback-based homing: no strange noise at startup
Smooth Movement

Overall

The code is organized into focused modules for hardware control, GUI management, and control logic, making it easier to maintain and extend.

Hardware / Software

You will need:

Arduino board running Firmata-compatible firmware, controlled from Python through pymata4
Matrix keypad
DC motor with driver stage
Potentiometers for position measurement and command input
Stepper motor, servo, and switches for the useless machine mechanism
Python packages: pymata4, matplotlib, and python-banyan

tkinter is used for the GUI and is typically included with standard Python installations.

How To Run

Start the Banyan backplane first, then launch the plot subscriber, then the GUI:

# 1
backplane

# 2
python ./motor_plot_subscriber.py

# 3
python ./global_interface.py

The GUI starts on the keypad login screen. After authentication, use the menu to enter either activity.

Suggested Workflow

Start the backplane.
Open the motor plot subscriber in a separate terminal.
Launch the global interface.
Use the keypad to log in.
Choose either Useless Machine or DC Motor from the menu.

Notes

The keypad code is defined in global_interface.py and can be changed there.
The separate plot window is intentionally decoupled from the GUI so the motor interface can keep running while the live graph updates.
The archive/ folder contains legacy versions of earlier keyboard and motor interfaces.

Image Processing

2026-03-21T00:00:00+00:00

A comprehensive collection of hands-on projects in image processing, implemented in Python with Jupyter notebooks.

Sessions Overview

Session	Topic	Key Techniques
BE 1	Thresholding & Morphology	Binary segmentation, morphological operators, image cleanup
BE 2	Geometric Transformations	Forward/backward mapping, coordinate systems, interpolation methods
BE 3	Feature Detection	Harris corner detection, scale-space blob detection, feature matching
BE 4	Bag of Visual Words	SIFT, KMeans clustering, TF-IDF, spatial pyramid pooling, image classification

Key Concepts

Image Segmentation: thresholding strategies, morphological operations, region-growing algorithms
Geometric Vision: coordinate transformations, image warping with bilinear interpolation, evaluation metrics (MSE, PSNR, SSIM)
Feature Extraction: Harris corner response, eigenvalue analysis, scale-invariant keypoint detection
Visual Recognition: building BoVW pipelines from scratch, vocabulary learning, similarity-based retrieval, SVM classification
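
To make the backward-mapping idea concrete, here is a small sketch of image warping with bilinear interpolation. It uses plain Python lists so it runs without dependencies (the notebooks themselves would more likely use NumPy), and `bilinear_sample`, `warp_backward`, and `inverse_map` are illustrative names, not taken from the notebooks.

```python
import math


def bilinear_sample(img, x: float, y: float) -> float:
    """Sample a grayscale image (list of rows) at real-valued (x, y)."""
    h, w = len(img), len(img[0])
    # Clamp so the 2x2 neighbourhood stays inside the image
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two rows, then along y between them
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom


def warp_backward(img, inverse_map, out_h: int, out_w: int):
    """Backward mapping: for each output pixel (u, v), look up its source (x, y)."""
    return [[bilinear_sample(img, *inverse_map(u, v)) for u in range(out_w)]
            for v in range(out_h)]
```

Backward mapping visits every output pixel exactly once, which avoids the holes that forward mapping leaves when several source pixels land on the same target or none do.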

Quick Start

Requirements:

Python 3.8+
See requirements.txt or install:

pip install numpy matplotlib scikit-image opencv-python scikit-learn scipy pandas jupyter

Run the notebooks:

Open any notebook in BE_session_*/ and execute cells top-to-bottom.

Repository Structure

.
├── BE_session_1/
│   ├── BE-Thresholding-Morphology-Student.ipynb
│   ├── Images/
│   └── defects/
├── BE_session_2/
│   ├── TD2_login1_login_2.ipynb
│   ├── parrot.jpg
│   └── ground_truth.npy
├── BE_session_3/
│   ├── BE3_login1_login_2.ipynb
│   └── [test images]
├── BE_session_4/
│   ├── BE4_BoVW_student1_student2.ipynb
│   └── TD4-Student/data-BE4/
│       ├── breastmnist_128.npz
│       └── SUN/ (10-class scene subset)
└── README.md

Technologies Used

Image Processing: scikit-image, OpenCV (cv2)
Numerical Computing: NumPy, SciPy
Machine Learning: scikit-learn (KMeans, SVM)
Visualization: Matplotlib
Development: Jupyter Notebooks

AMD HeterOCR

2025-11-05T00:00:00+00:00

A heterogeneous computing system that integrates CPU, GPU, and NPU resources to accelerate OCR tasks.

The system is designed to leverage the strengths of each processing unit to achieve high performance and efficiency in OCR applications.

AMD HeterOCR

CIIE 2025

A Go-based acoustic communication system

2025-09-20T00:00:00+00:00

A network communication system demonstrating UDP-based DNS resolution and TCP-based HTTP requests over an acoustic physical layer using Go.

This project implements essential transport layer protocols (UDP and TCP) over sound waves, enabling real-world internet services like domain name resolution and web page retrieval through audio signals.

Table of Contents
Overview
Core Features
Architecture
Prerequisites
Installation
Environment Setup
Usage
- DNS Query Example
- HTTP Request Example
Project Structure
Implementation Details
- UDP and DNS Implementation
- TCP and HTTP Implementation
Configuration Parameters
Troubleshooting

Overview

This project focuses on implementing transport layer protocols over acoustic communication. The main achievements are:

UDP-based DNS Resolution: A complete DNS client that sends DNS queries using UDP and resolves domain names (e.g., www.example.com → 93.184.216.34)
TCP-based HTTP Requests: A TCP client that establishes connections via 3-way handshake, sends HTTP GET requests, and receives web page content
Acoustic Physical Layer: Uses sound waves (audio signals) as the transmission medium instead of electrical cables

The system enables a node behind a NAT gateway to access internet services through acoustic communication with another node that has internet connectivity.

Core Features

UDP DNS Resolution

The DNS implementation provides domain name resolution functionality:

Key Components:

DNS Query Generation: Creates standard DNS query packets with proper format (DNS header, questions section, A record type)
UDP Transport: Wraps DNS queries in UDP datagrams (port 53)
Query ID Tracking: Assigns unique IDs to each query and tracks pending responses
Response Parsing: Extracts IP addresses from DNS response packets
Upstream DNS: Forwards queries to external DNS servers (1.1.1.1, 8.8.8.8)
Timeout Handling: 5-second timeout for unresponsive queries
Caching: Stores resolved domains for 5 minutes to reduce redundant queries
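
The 5-minute cache could be sketched as follows. This is written in Python for brevity, whereas the project itself is in Go, and `DNSCache` and its methods are hypothetical names rather than the repository's API. The injectable `clock` parameter is an assumption made so that expiry can be tested without waiting.

```python
import time


class DNSCache:
    """Domain -> IP cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # domain -> (ip, expiry time)

    def put(self, domain: str, ip: str) -> None:
        self._entries[domain.lower()] = (ip, self.clock() + self.ttl)

    def get(self, domain: str):
        """Return the cached IP, or None if absent or expired."""
        key = domain.lower()
        entry = self._entries.get(key)
        if entry is None:
            return None
        ip, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # drop stale entry so the next lookup re-queries
            return None
        return ip
```

A cache hit lets Node 1 skip the acoustic round trip entirely, which matters when each query takes hundreds of milliseconds over the audio link.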

Workflow:

User executes ping www.example.com
Node 1 creates DNS query packet with unique ID
Query wrapped in UDP datagram and sent via acoustic link to Node 2
Node 2 (NAT gateway) forwards query to internet DNS server (1.1.1.1)
DNS response received from internet
Node 2 sends response back through acoustic link
Node 1 extracts IP address and proceeds with ping/curl

TCP HTTP Implementation

The TCP implementation enables reliable HTTP communication:

Key Components:

3-Way Handshake: Complete SYN → SYN-ACK → ACK connection establishment
State Machine: Tracks connection states (CLOSED → SYN_SENT → ESTABLISHED → FIN_WAIT)
Sequence Numbers: Maintains proper sequence and acknowledgment numbers
Data Transmission: Sends HTTP GET requests with PSH+ACK flags
Reliable Delivery: Handles ACKs and retransmissions
Connection Teardown: FIN handshake for clean connection closure
HTTP Request Generation: Constructs valid HTTP/1.0 GET requests
Response Buffering: Accumulates TCP segments into complete HTTP response

Workflow:

User executes curl www.example.com
Node 1 resolves domain to IP (93.184.216.34) via DNS
Node 1 initiates TCP connection:
- Sends SYN packet through acoustic link
- Node 2 forwards to internet server
- Receives SYN-ACK from server via Node 2
- Sends ACK to complete handshake
Node 1 sends HTTP GET request in PSH+ACK segment
Node 2 forwards request to web server
Web server responds with HTTP headers and HTML content
Node 2 forwards response segments back through acoustic link
Node 1 reassembles segments and displays HTTP response
Connection closed with FIN handshake

Lower Layers (Brief)

To support UDP and TCP over acoustic communication, several lower layers are implemented:

Physical Layer (src/phy/):

Chirp Preamble: 2-10kHz frequency sweep for frame synchronization
Manchester Encoding: Line coding for reliable bit transmission
CRC-16: Error detection for frame integrity
JACK Audio: 48kHz sample rate audio I/O
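
For readers unfamiliar with the line coding and checksum named above, here is a minimal Python sketch (the project's real code is Go, under src/phy/). Two things are assumptions: the Manchester convention (1 becomes high-low, 0 becomes low-high) and the CRC variant (CRC-16/CCITT-FALSE, polynomial 0x1021, initial value 0xFFFF). The source only says "Manchester" and "CRC-16", so the actual choices may differ.

```python
def manchester_encode(bits):
    """Each bit becomes two half-bit symbols: 1 -> (1, 0), 0 -> (0, 1) (assumed convention)."""
    out = []
    for b in bits:
        out.extend((1, 0) if b else (0, 1))
    return out


def manchester_decode(symbols):
    """Inverse of manchester_encode; rejects pairs without a mid-bit transition."""
    if len(symbols) % 2:
        raise ValueError("odd number of half-bit symbols")
    bits = []
    for i in range(0, len(symbols), 2):
        pair = (symbols[i], symbols[i + 1])
        if pair == (1, 0):
            bits.append(1)
        elif pair == (0, 1):
            bits.append(0)
        else:
            raise ValueError(f"invalid Manchester pair at symbol {i}: {pair}")
    return bits


def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF, no reflection)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc
```

Every Manchester bit contains a transition, which helps the receiver stay synchronized, while the CRC lets it discard frames whose bits were flipped by noise.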

Data Link Layer (src/data_link/):

CSMA/CA: Carrier sense multiple access with collision avoidance
ACK Frames: Acknowledgment frames for received data
Sliding Window: Selective repeat ARQ for reliable delivery
Frame Format: [Preamble | Header | Payload | CRC]

Network Layer (src/ip/):

IP Routing: Basic IPv4 packet forwarding
ICMP: Ping functionality (echo request/reply)
NAT: Network Address Translation for internet access

These layers provide the foundation for reliable packet delivery. For details, see the code in our previous repository.

Architecture

┌─────────────────────────────────────────────────────────┐
│         Application Layer                               │
│         [HTTP Requests, DNS Queries]                    │
│         ↑                                               │
│         curl www.example.com                            │
│         ping www.google.com                             │
├─────────────────────────────────────────────────────────┤
│         Transport Layer                                 │
│         [TCP: 3-way handshake, reliable delivery]       │
│         [UDP: DNS queries to port 53]                   │
├─────────────────────────────────────────────────────────┤
│         Network Layer                                   │
│         [IP Routing, NAT, ICMP]                         │
├─────────────────────────────────────────────────────────┤
│         Data Link Layer                                 │
│         [CSMA/CA, Framing, ACK]                         │
├─────────────────────────────────────────────────────────┤
│         Physical Layer                                  │
│         [Chirp, Manchester, CRC]                        │
│         JACK Audio Server (48kHz)                       │
└─────────────────────────────────────────────────────────┘

Prerequisites

Go 1.25+: Install from golang.org
JACK Audio Server: Required for audio I/O
- Windows: Download from jackaudio.org
- Install ASIO4ALL driver for low-latency audio
Network Requirements:
- Microsoft KM-TEST Loopback Adapter
- Audio cables or speakers/microphones for acoustic coupling

Installation

Clone the repository:
Install Go dependencies:
```
go mod download
```

Environment Setup

1. Start JACK Server

cd "C:\Program Files\JACK2\"
.\jackd.exe -S -X winmme -dportaudio -d"ASIO::ASIO4ALL v2" -r48000 -p128

Parameters:

-r48000: Sample rate of 48kHz
-p128: Buffer size of 128 samples (adjust for latency/stability trade-off)
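
The latency side of that trade-off follows directly from the two parameters: one buffer period lasts frames divided by sample rate. A tiny sketch of the arithmetic (the function name is illustrative, and real end-to-end latency also depends on the driver and the number of periods):

```python
def buffer_latency_ms(frames_per_period: int, sample_rate_hz: int) -> float:
    """Duration of one audio buffer period in milliseconds."""
    return 1000.0 * frames_per_period / sample_rate_hz


# -p128 at -r48000 is about 2.67 ms per period; doubling to -p256 gives about 5.33 ms
for frames in (128, 256, 512):
    print(frames, round(buffer_latency_ms(frames, 48000), 2))
```

Larger buffers are more forgiving of scheduling hiccups but add delay to every frame sent over the acoustic link.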

2. Install Microsoft Loopback Adapter

Open Device Manager
Click Action → Add legacy hardware
Select Install the hardware that I manually select from a list
Select Network adapters
Select Microsoft as manufacturer
Select Microsoft KM-TEST Loopback Adapter
Complete installation

3. Configure Network Interfaces

Configure static IP addresses for the Loopback Adapter:

Node 1 (Client): 172.18.0.1 / 255.255.255.0
Node 2 (Gateway): 172.18.0.2 / 255.255.255.0

Configure Gateway on Node 1:

Set default gateway: 172.18.0.2
Set DNS server: 8.8.8.8

4. Get Network Interface Name

# List all network interfaces
go run cmd\network_test\main.go list
# Interface name format: \Device\NPF_{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}

Usage

DNS Query Example

Node 1 (Client):

go run cmd\node\main.go 1

Node 2 (NAT Gateway):

go run cmd\node\main.go 2

Execute DNS query on Node 1:

# In Node 1's console
ping www.google.com -n 4

Expected output:

[Node 1] Resolving domain: www.google.com
[DNS] Created query for www.google.com (ID: 1)
[Node 1] DNS query sent via acoustic link
[Node 2] Received DNS query, forwarding to 1.1.1.1
[Node 1] DNS response received: www.google.com -> 142.250.185.68
Pinging www.google.com [142.250.185.68] with 32 bytes of data:
Reply from 142.250.185.68: time=350ms
Reply from 142.250.185.68: time=345ms
...

HTTP Request Example

Execute HTTP request on Node 1:

# In Node 1's console
curl www.example.com

Expected output:

[Node 1] Resolving domain: www.example.com
[DNS] Created query for www.example.com (ID: 2)
[Node 1] Resolved www.example.com to 93.184.216.34
[TCP] Initiating connection to 93.184.216.34:80
[TCP] Sending SYN (Seq=0x12345678)
[TCP] Received SYN-ACK (Seq=XXXXXXX, Ack=0x12345679)
[TCP] Connection ESTABLISHED
[Curl] Connected! Sending HTTP Request...
[TCP] Sending HTTP GET request

--- HTTP Response ---
HTTP/1.0 200 OK
Content-Type: text/html
Content-Length: 1256

<!doctype html>
<html>
<head>
    <title>Example Domain</title>
...
</head>
<body>
    <h1>Example Domain</h1>
    <p>This domain is for use in illustrative examples...</p>
</body>
</html>
---------------------
[TCP] Closing connection
[TCP] Sent FIN

Project Structure

acoustic-link/
├── cmd/
│   ├── network_test/          # Network interface testing
│   │   └── main.go
│   └── node/                  # Main application
│       ├── main.go            # Node logic, DNS/TCP/HTTP handlers
│       ├── reception.go       # Acoustic signal reception
│       └── transmission.go    # Acoustic signal transmission
├── src/
│   ├── transport/             # ★ Transport Layer (UDP, TCP, DNS)
│   │   ├── udp.go             # UDP packet creation and parsing
│   │   ├── dns.go             # DNS client: query generation, response parsing
│   │   └── tcp.go             # TCP client: 3-way handshake, state machine
│   ├── ip/                    # Network layer (IP routing, NAT, ICMP)
│   │   ├── router.go
│   │   └── icmp.go
│   ├── data_link/             # Data link layer (CSMA/CA MAC)
│   │   └── mac.go
│   ├── phy/                   # Physical layer (chirp, CRC, Manchester)
│   │   ├── chirp.go
│   │   ├── crc.go
│   │   ├── frame.go
│   │   └── linecoding.go
│   ├── encode/                # Signal encoding
│   │   └── encoder.go
│   ├── decode/                # Signal decoding
│   │   └── decoder.go
│   └── utils/                 # Utilities
│       └── utils.go
├── docs/                      # Documentation
│   └── report.tex
├── go.mod
└── README.md

Key Files for UDP/DNS and TCP/HTTP:

src/transport/dns.go: DNS query/response handling
src/transport/udp.go: UDP packet creation
src/transport/tcp.go: TCP state machine and reliability
cmd/node/main.go: Application logic (RunCurl, RunPing)

Implementation Details

UDP and DNS Implementation

UDP Packet Creation (src/transport/udp.go):

func CreateUDPPacket(srcIP, dstIP net.IP, srcPort, dstPort uint16, payload []byte) ([]byte, error) {
    // 1. Create IPv4 layer
    ipLayer := &layers.IPv4{
        Version:  4,
        TTL:      64,
        SrcIP:    srcIP,
        DstIP:    dstIP,
        Protocol: layers.IPProtocolUDP,
    }

    // 2. Create UDP layer
    udpLayer := &layers.UDP{
        SrcPort: layers.UDPPort(srcPort),
        DstPort: layers.UDPPort(dstPort),
    }

    // 3. Set network layer for checksum calculation
    if err := udpLayer.SetNetworkLayerForChecksum(ipLayer); err != nil {
        return nil, err
    }

    // 4. Serialize layers: [IP][UDP][Payload], letting gopacket fill in lengths and checksums
    buffer := gopacket.NewSerializeBuffer()
    opts := gopacket.SerializeOptions{FixLengths: true, ComputeChecksums: true}
    if err := gopacket.SerializeLayers(buffer, opts, ipLayer, udpLayer, gopacket.Payload(payload)); err != nil {
        return nil, err
    }
    return buffer.Bytes(), nil
}

DNS Query Generation (src/transport/dns.go):

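func (ds *DNSServer) CreateDNSQuery(domain string) ([]byte, uint16, error) {
    // Assign unique query ID
    queryID := ds.queryID
    ds.queryID++

    // Build DNS query packet with a single A-record question
    dns := &layers.DNS{
        ID:     queryID,
        QR:     false,              // 0 = query
        OpCode: layers.DNSOpCodeQuery,
        RD:     true,               // Recursion desired
        Questions: []layers.DNSQuestion{{
            Name:  []byte(domain),
            Type:  layers.DNSTypeA,
            Class: layers.DNSClassIN,
        }},
    }

    // Serialize DNS packet (FixLengths fills in the question count)
    buffer := gopacket.NewSerializeBuffer()
    if err := dns.SerializeTo(buffer, gopacket.SerializeOptions{FixLengths: true}); err != nil {
        return nil, 0, err
    }
    return buffer.Bytes(), queryID, nil
}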

DNS Response Handling (src/transport/dns.go):

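func (ds *DNSServer) HandleDNSResponse(payload []byte) {
    // 1. Parse DNS response; drop packets that fail to parse
    ip, queryID, err := ds.ParseDNSResponse(payload)
    if err != nil {
        return
    }

    // 2. Find matching pending query
    ds.mu.Lock()
    if ch, ok := ds.pendingQuery[queryID]; ok {
        ch <- ip  // Send resolved IP to waiting goroutine
        delete(ds.pendingQuery, queryID)
    }
    ds.mu.Unlock()
}

func (ds *DNSServer) WaitForResponse(queryID uint16) (net.IP, error) {
    // Register the channel under the same lock HandleDNSResponse uses
    ch := make(chan net.IP, 1)
    ds.mu.Lock()
    ds.pendingQuery[queryID] = ch
    ds.mu.Unlock()

    // Wait for response with timeout
    select {
    case ip := <-ch:
        return ip, nil
    case <-time.After(5 * time.Second):
        // Forget the query so a late response is ignored
        ds.mu.Lock()
        delete(ds.pendingQuery, queryID)
        ds.mu.Unlock()
        return nil, fmt.Errorf("DNS timeout")
    }
}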

Complete DNS Workflow:

// 1. Node 1 creates DNS query
domain := "www.example.com"
dnsPayload, queryID, err := dnsServer.CreateDNSQuery(domain)
if err != nil {
    return err
}

// 2. Wrap in UDP packet
udpPacket, err := CreateUDPPacket(node1IP, node2IP, 55123, 53, dnsPayload)
if err != nil {
    return err
}

// 3. Send through acoustic link to Node 2
node.QueueIPFrame(node2IP, udpPacket)

// 4. (On Node 2) receive and forward to internet DNS (1.1.1.1:53)
router.ForwardToInternet(udpPacket)

// 5. (Back on Node 1) wait for response
ip, err := dnsServer.WaitForResponse(queryID)
if err != nil {
    return err
}

// 6. Use resolved IP for subsequent operations
fmt.Printf("Resolved: %s -> %s\n", domain, ip)

TCP and HTTP Implementation

TCP State Machine (src/transport/tcp.go):

type TCPClient struct {
    SrcIP, DstIP     net.IP
    SrcPort, DstPort layers.TCPPort
    SeqNum, AckNum   uint32
    State            string  // "CLOSED", "SYN_SENT", "ESTABLISHED", "FIN_WAIT"
    DataBuffer       bytes.Buffer
    SendIPPacket     func(destIP net.IP, payload []byte)
    cond             *sync.Cond  // Signalled on state changes; cond.L guards State
}

// 3-Way Handshake
func (c *TCPClient) Connect() error {
    // cond.Wait must be called with cond.L held
    c.cond.L.Lock()
    defer c.cond.L.Unlock()

    // 1. Send SYN
    c.State = "SYN_SENT"
    c.sendSegment(true, false, false, nil)  // SYN=true, ACK=false, FIN=false

    // 2. Wait for SYN-ACK (handled in HandlePacket)
    // 3. Send ACK (handled in HandlePacket when SYN-ACK received)

    // Block until ESTABLISHED; Wait releases the lock while sleeping
    for c.State != "ESTABLISHED" {
        c.cond.Wait()
    }
    return nil
}

TCP Packet Handling (src/transport/tcp.go):

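func (c *TCPClient) HandlePacket(ip *layers.IPv4, tcp *layers.TCP) {
    // Hold the condition's lock while reading or changing State
    c.cond.L.Lock()
    defer c.cond.L.Unlock()

    switch c.State {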
    case "SYN_SENT":
        // Expecting SYN-ACK
        if tcp.SYN && tcp.ACK {
            c.AckNum = tcp.Seq + 1        // ACK server's SYN
            c.SeqNum = tcp.Ack            // Update our sequence number
            c.State = "ESTABLISHED"
            
            // Send final ACK to complete handshake
            c.sendSegment(false, true, false, nil)  // SYN=false, ACK=true
            c.cond.Broadcast()  // Wake up Connect()
        }
        
    case "ESTABLISHED":
        // Handle incoming data
        if len(tcp.Payload) > 0 {
            // Check sequence number (in-order delivery)
            if tcp.Seq == c.AckNum {
                c.DataBuffer.Write(tcp.Payload)  // Accumulate HTTP response
                c.AckNum += uint32(len(tcp.Payload))
                c.sendSegment(false, true, false, nil)  // Send ACK
            }
        }
        
        // Handle FIN (server closing)
        if tcp.FIN {
            c.AckNum++
            c.sendSegment(false, true, false, nil)  // ACK the FIN
        }
    }
}

HTTP Request Generation (cmd/node/main.go):

func (n *Node) RunCurl(domain string) {
    // 1. Resolve domain name
    n.SendDNSQuery(domain, func(resolvedIP net.IP) {
        // 2. Create TCP connection
        srcPort := uint16(rand.Intn(10000) + 50000)
        tcpClient := transport.NewTCPClient(
            n.ipAddr,              // 172.18.0.1
            resolvedIP,            // 93.184.216.34 (example.com)
            srcPort,               // Random port (e.g., 55123)
            80,                    // HTTP port
            0x12345678,            // Initial sequence number
            n.QueueIPFrame,        // Send function
        )
        
        // 3. Establish TCP connection (3-way handshake)
        if err := tcpClient.Connect(); err != nil {
            fmt.Printf("Connection failed: %v\n", err)
            return
        }
        
        // 4. Send HTTP GET request
        httpRequest := fmt.Sprintf(
            "GET / HTTP/1.0\r\n"+
            "Host: %s\r\n"+
            "User-Agent: Aethernet\r\n"+
            "\r\n",
            domain,
        )
        tcpClient.Write([]byte(httpRequest))  // Sends as PSH+ACK segment
        
        // 5. Wait a fixed 15 seconds for the HTTP response segments to arrive
        time.Sleep(15 * time.Second)
        
        // 6. Display accumulated response
        fmt.Println("\n--- HTTP Response ---")
        fmt.Println(tcpClient.Read())  // Read from DataBuffer
        fmt.Println("---------------------")
        
        // 7. Close TCP connection
        tcpClient.Close()  // Sends FIN
    })
}

TCP Segment Serialization:

func (c *TCPClient) sendSegment(syn, ack, fin bool, data []byte) {
    // Create TCP layer
    tcp := &layers.TCP{
        SrcPort: c.SrcPort,
        DstPort: c.DstPort,
        Seq:     c.SeqNum,
        Ack:     c.AckNum,
        SYN:     syn,
        ACK:     ack,
        FIN:     fin,
        PSH:     len(data) > 0,  // Push flag if sending data
        Window:  65535,
    }
    
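    // Create IP layer
    ip := &layers.IPv4{
        Version:  4,
        SrcIP:    c.SrcIP,
        DstIP:    c.DstIP,
        Protocol: layers.IPProtocolTCP,
        TTL:      64,
    }

    // Set network layer for TCP checksum
    if err := tcp.SetNetworkLayerForChecksum(ip); err != nil {
        fmt.Printf("[TCP] checksum setup failed: %v\n", err)
        return
    }

    // Serialize: [IP][TCP][HTTP Data], letting gopacket fill in lengths and checksums
    buffer := gopacket.NewSerializeBuffer()
    opts := gopacket.SerializeOptions{FixLengths: true, ComputeChecksums: true}
    if err := gopacket.SerializeLayers(buffer, opts, ip, tcp, gopacket.Payload(data)); err != nil {
        fmt.Printf("[TCP] serialize failed: %v\n", err)
        return
    }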
    
    // Send through lower layers
    c.SendIPPacket(c.DstIP, buffer.Bytes())
}

Configuration Parameters

Edit cmd/node/main.go to adjust:

const (
    // Transport layer
    TCPTimeOut    = 15 * time.Second   // TCP response timeout
    CloudflareDNS = "1.1.1.1"          // Upstream DNS server
    
    // Data link layer
    RTTTimeout    = 350 * time.Millisecond  // ACK timeout
    MaxRetries    = 2                        // Max retransmission attempts
    
    // Network layer
    GatewayMAC    = "00-00-5e-00-01-01"     // Gateway MAC address
)

Troubleshooting

DNS Resolution Fails

Verify Node 2 has internet connectivity
Check upstream DNS servers (1.1.1.1, 8.8.8.8) are accessible
Increase DNS timeout if network is slow
Check firewall allows outbound UDP port 53

TCP Connection Timeout

Verify DNS resolution succeeded first
Check Node 2’s NAT is properly forwarding TCP packets
Increase TCPTimeOut if remote server is slow
Verify destination server is reachable (try direct ping first)

HTTP Response Incomplete

Increase TCPTimeOut to allow more time for response
Check acoustic link quality (reduce noise, adjust volume)
Verify TCP sequence numbers are correct
Check for packet loss in lower layers

No Audio I/O

Ensure JACK server is running
Check audio device configuration
Verify audio cables are connected
Test with simple ping first before trying DNS/HTTP

High Packet Loss

Reduce ambient noise
Adjust microphone/speaker volume
Increase buffer size in JACK configuration
Use shorter cable connections when possible

Compiler for a Simple Language in OCaml

2025-09-15T00:00:00+00:00

An end-to-end compiler toolchain for a small C-like language (Oat), implemented in OCaml.

It covers the full pipeline from parsing to LLVM IR and x86 code generation, and includes a set of classic static analyses and optimizations.

What I built

Type system / typechecking
- Implemented a specification-driven typechecker (contexts, subtyping, and full expression/statement checking) to enforce type safety.
- Added structured context building for globals, structs, and function declarations.
Frontend compilation (language features)
- Extended the frontend to support structs and function pointers.
- Implemented array length and array initializers.
- Added checked cast support via null-pointer checking with correct scoping for the “not-null” branch.
Generic dataflow framework
- Implemented a reusable worklist solver (as an OCaml functor) that computes fixpoints over CFG-like graphs.
- Used analysis-specific lattice operations (combine, equality/ordering) and flow functions.
Analyses and optimizations
- Liveness analysis (dataflow) as a baseline analysis and a backend input.
- Alias analysis to determine when a pointer uniquely names a stack slot.
- Dead code elimination (DCE) with store elimination when the destination is proven non-aliased and dead.
- Constant propagation + constant folding, then iterative constprop → dce optimization (O1-style).
Backend: register allocation
- Implemented an improved register allocation strategy leveraging liveness information, outperforming a greedy baseline on provided quality tests.
Experimentation / validation
- Benchmarked multiple configurations (baseline, greedy, better, clang) with and without -O1, using time on representative programs.
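
To give a feel for the worklist approach described above, here is a compact sketch of a backward solver instantiated for liveness. It is written in Python, not OCaml, with plain functions standing in for the functor's lattice and flow-function arguments. All names (`solve_backward`, `liveness`, `uses`, `defs`) are illustrative and are not the project's actual interface.

```python
from collections import deque


def solve_backward(nodes, succs, flow, combine, bottom):
    """Generic backward worklist solver.

    succs[n] -> successor nodes; flow(n, out_fact) -> in_fact;
    combine(list_of_facts) -> joined fact; bottom is the initial fact.
    Returns (in_facts, out_facts) at the fixpoint.
    """
    preds = {n: [] for n in nodes}
    for n in nodes:
        for s in succs[n]:
            preds[s].append(n)
    in_facts = {n: bottom for n in nodes}
    out_facts = {n: bottom for n in nodes}
    work = deque(nodes)
    while work:
        n = work.popleft()
        out_facts[n] = combine([in_facts[s] for s in succs[n]])
        new_in = flow(n, out_facts[n])
        if new_in != in_facts[n]:
            in_facts[n] = new_in
            # The fact changed, so every predecessor must be revisited
            for p in preds[n]:
                if p not in work:
                    work.append(p)
    return in_facts, out_facts


def liveness(nodes, succs, uses, defs):
    """Liveness: in[n] = uses[n] U (out[n] - defs[n]), out[n] = U in[succ]."""
    flow = lambda n, out: frozenset(uses[n] | (out - defs[n]))
    combine = lambda facts: frozenset().union(*facts)
    return solve_backward(nodes, succs, flow, combine, frozenset())
```

Only `flow` and `combine` are analysis-specific, so the same loop can be reused for another analysis by swapping those two functions (and the traversal direction for forward analyses).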

Repositories

Optimization + analyses: opt
Typechecking: type-check

LAN-based Simulator

2025-07-02T00:00:00+00:00

A LAN-based Simulator Client implemented in Python, designed to display GNSS data in a live ephemeris table and to set the simulation parameters.

Layout

Ephemeris Table

GNSS Client

2025-07-01T00:00:00+00:00

A GNSS Client implemented in Python, designed to visualize NMEA 0183 and RTCM 3.3 messages in a dashboard, with support for multiple COM ports.

Dashboard

Details for each COM

Fracture-Fixation-FEA-Simulation

2025-06-12T00:00:00+00:00

A comprehensive finite element analysis (FEA) toolkit for investigating the biomechanical behavior of bone-fixator systems during fracture healing.

This project models stress transfer mechanisms and analyzes how different fixator stiffnesses influence the healing process through computational simulation.

🎯 Project Overview

This project leverages 2D finite element analysis to explore the critical biomechanical phenomenon of “stress transfer” during the bone fracture healing process. The simulation models:

A Bone-Fixator System: Incorporating time-dependent callus maturation to simulate the healing progression.
Stress Redistribution: Analyzing how mechanical stresses are distributed among the bone, fixator, and developing callus tissues.
Fracture Gap Strain Evolution: Tracking changes in strain within the fracture gap under various fixator rigidities.
Parametric Studies: Conducting comparative analyses using flexible, standard, and rigid fixator configurations.

🏗️ Project Structure

The project is organized as follows:

.
├── main.py                      # Main script to run simulations and generate results
├── README.md                    # This README file
├── utils/                       # Directory for core simulation modules
│   ├── fea_core.py              # Core finite element analysis logic
│   ├── materials.py             # Definitions of material properties
│   ├── simulation.py            # Simulation workflow and control
│   ├── analysis.py              # Post-processing: stress and strain calculations
│   └── plot_utils.py            # Utilities for generating plots and visualizations
├── output_advanced/             # Default directory for generated simulation results
│   ├── Flexible/                # Results for the flexible fixator
│   ├── Standard/                # Results for the standard fixator
│   └── Rigid/                   # Results for the rigid fixator
├── assets/                      # Contains model structure figure and generated animations
└── report/                      # LaTeX source documentation

🚀 Quick Start

Prerequisites

Ensure you have Python 3 installed. Then, install the necessary libraries:

pip install numpy matplotlib scipy pandas

Running the Simulation

Execute the main script from the project’s root directory:

python main.py

📊 Key Features

1. Material Modeling

The simulation incorporates distinct material properties:

Bone: Modeled as an elastic material with an Elastic Modulus of 18 GPa and a Poisson’s ratio of 0.3.
Callus: Simulates the healing process with a time-dependent stiffening behavior, where its Elastic Modulus gradually increases from an initial 0.1 GPa up to 18 GPa, matching mature bone.
Fixator: Subject to parametric study with three configurations based on Elastic Modulus:
- Flexible Fixator: 35 GPa (35.0e9 Pa)
- Standard Fixator: 70 GPa (70.0e9 Pa)
- Rigid Fixator: 140 GPa (140.0e9 Pa)
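
As an illustration of the material setup, the sketch below stores the three fixator moduli and ramps the callus modulus from 0.1 GPa to 18 GPa over the simulation steps. The linear ramp is an assumption made for this example; the source only says the modulus "gradually increases", so the law used in materials.py may differ. The names here are hypothetical.

```python
E_BONE = 18.0e9            # Pa, from the material list above
E_CALLUS_INITIAL = 0.1e9   # Pa
FIXATOR_MODULI = {"Flexible": 35.0e9, "Standard": 70.0e9, "Rigid": 140.0e9}  # Pa


def callus_modulus(step: int, n_steps: int = 20) -> float:
    """Callus elastic modulus at a given healing step (assumed linear ramp)."""
    if n_steps < 2:
        raise ValueError("need at least two steps to define a ramp")
    frac = min(max(step / (n_steps - 1), 0.0), 1.0)
    return E_CALLUS_INITIAL + frac * (E_BONE - E_CALLUS_INITIAL)


# Modulus in GPa at the first, middle, and last of the default 20 steps
for step in (0, 10, 19):
    print(step, round(callus_modulus(step) / 1e9, 2))
```

Any monotonic curve with the same end points would slot into the same place in the time-stepping loop.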

2. Finite Element Analysis

Methodology: Employs 2D plane stress analysis.
Elements: Utilizes 4-node quadrilateral elements for meshing the domain.
Meshing: Features adaptive mesh generation with configurable resolution to balance accuracy and computational cost.
Simulation Steps: A time-stepping simulation (default: 20 steps) models the progression of fracture healing and callus maturation.

3. Analysis Metrics

The simulation tracks and calculates several key biomechanical indicators:

Von Mises Stress: Distribution across all material regions (bone, callus, fixator).
Average Stress: Mean stress values calculated for bone, callus, and fixator components.
Fracture Gap Strain: Evolution of strain within the fracture gap over the healing period.
Stress Shielding Effect: Quantified by comparing stress levels in the bone under different fixator stiffnesses.
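
For reference, the Von Mises stress in a 2D plane stress state depends only on the two in-plane normal stresses and the in-plane shear stress. A minimal sketch (the function name is illustrative, not necessarily what analysis.py uses):

```python
import math


def von_mises_plane_stress(sx: float, sy: float, txy: float) -> float:
    """Von Mises equivalent stress for plane stress (sigma_z = 0)."""
    return math.sqrt(sx * sx - sx * sy + sy * sy + 3.0 * txy * txy)


# Uniaxial tension of 100 MPa gives an equivalent stress of 100 MPa
print(von_mises_plane_stress(100e6, 0.0, 0.0) / 1e6)
```

Averaging this quantity over the elements of one region would give the per-component mean stress reported in the metrics above.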

4. Visualization

Comprehensive visualization tools are integrated to aid in the interpretation of results:

Stress Contour Plots: Generated for each time step, illustrating stress distribution.
Comparative Stress Evolution Graphs: Plotting average stress in different components over time for each fixator type.
Gap Strain Progression Charts: Showing how fracture gap strain changes throughout the healing simulation.
Parametric Comparison Visualizations: Side-by-side plots for easy comparison of outcomes from different fixator stiffnesses.

📈 Output Files

Upon successful execution, main.py generates the following in the output_advanced/ directory:

summary_final_results.csv: A CSV file summarizing the final average Von Mises stress in the fixator, bone, and callus, along with the final fracture gap strain for each fixator type.
parametric_comparison.png: An image file providing a side-by-side visual comparison of key results across all simulated fixator types.
Individual Simulation Folders (Flexible/, Standard/, Rigid/):
- simulation_results.csv: Time-series data for average stresses and gap strain for that specific simulation run.
- stress_shielding.png: A plot showing the evolution of average stress in bone, callus, and fixator over time.
- gap_strain.png: A plot illustrating the evolution of fracture gap strain over time.
- stress_images/: A directory containing individual frames of stress contour plots, which can be compiled into an animation.

🔬 Results Interpretation

The simulation outputs provide insights into:

Stress Shielding Analysis

High Fixator Stiffness (Rigid Fixator): Tends to bear a larger portion of the mechanical load, leading to increased stress within the fixator itself and potentially reducing the mechanical stimulus (stress) experienced by the bone. This is known as stress shielding.
Low Fixator Stiffness (Flexible Fixator): Allows for more load sharing with the bone and callus, resulting in a more uniform stress distribution. This can promote better bone loading, which is often considered beneficial for healing.

Gap Strain Evolution

Initial Phase: The fracture gap typically experiences high strain levels at the beginning of the healing process when the callus is soft.
Healing Progression: As the callus matures and stiffens, the strain in the fracture gap gradually reduces.
Final Strain Levels: The magnitude of strain at the end of the simulation can be an indicator of the mechanical stability achieved and potentially correlates with healing success. Interfragmentary strain theory suggests optimal ranges of strain for different stages of healing.

📚 Documentation

Detailed technical documentation, theoretical background, and in-depth analysis of results are available in the report/ directory, typically as a LaTeX-generated PDF document.

Deep-learning-Cardiac-Cine-MRI-Segmentation

2025-06-01T00:00:00+00:00

Cardiac Cine MRI Segmentation

Overview

Goal: Segment three cardiac structures in cine MRI: the right ventricle (RV), the myocardium (MYO), and the left ventricle (LV).
Challenge: Accurate and robust delineation of these structures, which can vary in shape and appearance.
Approach: U-Net based deep learning framework.
1. Baseline U-Net implementation.
2. Impact of removing U-Net skip connections.
3. Effect of data augmentation.
4. Comparison of Binary Cross-Entropy vs. Soft Dice Loss.
5. Improvements with Attention and Hybrid Loss.
Evaluation: Dice Similarity Coefficient (DSC).
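Since every result below is reported as a Dice Similarity Coefficient, a minimal sketch of the metric for one binary mask may help. The function name and the convention of returning 1.0 when both masks are empty are assumptions for this example, not the project's evaluation code.

```python
import numpy as np

def dice_coefficient(pred, target):
    """DSC = 2 * |A intersect B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / denom)

# Two 4-pixel masks that share 3 pixels.
target = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 0]])
pred = np.array([[1, 1, 0], [1, 0, 1], [0, 0, 0]])
score = dice_coefficient(pred, target)  # 2 * 3 / (4 + 4) = 0.75
```

The per-structure means and standard deviations in the tables would then come from applying such a function to each test slice and structure separately.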

Task (a): U-Net (Baseline)

Baseline Training Loss and Validation Loss

Results: Dice Coefficients

Structure	Mean Dice	Std. Dev.
LV	0.9519	0.0086
MYO	0.8734	0.0161
RV	0.8920	0.0310

Segmentation Examples

Baseline Segmentation Example RV

Baseline Segmentation Example MYO

Baseline Segmentation Example LV

Discussion - Baseline

LV Segmentation: Achieved the highest mean Dice score. This is often expected as the LV is typically a large, relatively well-defined structure with good contrast against surrounding tissues in many MRI sequences.
RV Segmentation: Also showed good performance. The RV cavity is usually clearly visible.
MYO Segmentation: Had the lowest mean Dice score. The myocardium is a thin, ring-shaped structure surrounding the LV, and its boundaries with the LV cavity (endocardium) and with the surrounding tissue (epicardium) are harder to delineate accurately. For such a thin structure, a small boundary error also removes a larger fraction of the overlap, which lowers the Dice score.
The standard deviations are relatively small, indicating consistent performance across the test slices.

Task (b): U-Net without Skip Connections

Modification: No skip connections in the U-Net architecture.
Training: Same as baseline (BCE Loss, lr=0.01, 50 epochs).
Purpose: Evaluate the importance of skip connections.

Training Loss and Validation Loss (No Skip Connections)

Training and Validation Loss for Baseline U-Net without skip connections.

Results: Dice Coefficients

Structure	Baseline DSC	No Shortcut DSC
LV Mean	0.9519	0.9260
MYO Mean	0.8734	0.8223
RV Mean	0.8920	0.8588
LV std	0.0086	0.0111
MYO std	0.0161	0.0168
RV std	0.0310	0.0296

Discussion - Impact of No Skip Connections

Significant Drop in Performance: All structures showed a noticeable decrease in DSC.
Reason: Skip connections provide high-resolution spatial information from the encoder to the decoder, crucial for accurate boundary localization. They also aid gradient flow.
Conclusion: Skip connections are vital for U-Net’s segmentation accuracy in this task.

Task (c): U-Net with Data Augmentation

Network: Baseline U-Net architecture.
Augmentations (Training Set Only):
RandomHorizontalFlip
RandomRotation(15°)
RandomAffine(degrees=50, translate=(0.1,0.1), scale=(0.9,1.1), shear=5)
Implementation: SegmentationDataset ensuring identical transforms for image and mask.
Training: BCE Loss, lr=0.01, 50 epochs.
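The key requirement in the list above is that the image and its mask receive the identical random transform. The sketch below shows that idea with NumPy only, using a flip and a small circular shift as stand-ins for the torchvision transforms named above; the function name and parameters are hypothetical and this is not the project's SegmentationDataset.

```python
import numpy as np

def paired_random_augment(image, mask, rng, max_shift=3):
    """Draw ONE set of random parameters and apply it to both arrays."""
    flip = rng.random() < 0.5
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)

    def apply(a):
        if flip:
            a = a[:, ::-1]  # horizontal flip
        # Circular shift (wraps around); a simplification for this sketch.
        return np.roll(a, shift=(dy, dx), axis=(0, 1))

    return apply(image), apply(mask)

rng = np.random.default_rng(0)
image = rng.random((32, 32))
mask = image > 0.5
image_aug, mask_aug = paired_random_augment(image, mask, rng)
```

Drawing the random parameters once, outside the per-array function, is what keeps the pair aligned; sampling separately for image and mask would silently corrupt the labels.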

Training Loss and Validation Loss (with Data Augmentation)

Training and Validation Loss for Baseline U-Net with Data Augmentation.

Results: Dice Coefficients

Structure	Baseline DSC	Data Aug. DSC
LV Mean	0.9519	0.9276
MYO Mean	0.8734	0.8469
RV Mean	0.8920	0.8635
LV std	0.0086	0.0107
MYO std	0.0161	0.0149
RV std	0.0310	0.0384

Discussion - Impact of Data Augmentation

DSC Decrease: The specific augmentation strategy led to slightly lower Dice scores.
Possible Reasons:
Some augmentations may have distorted the anatomy too strongly, making precise boundaries harder to learn. RandomAffine with degrees=50, applied on top of RandomRotation(15°), can rotate the heart well beyond the orientations seen in the unaugmented test slices and alter the relative location of the structures.
Conclusion: The relative location of structures appears to matter for this task, and this particular augmentation set was not beneficial for this dataset. More careful selection or tuning of augmentations is needed.

Task (d): U-Net with Soft Dice Loss

Network: Baseline U-Net architecture.
Training Data: Original Non-Augmented Training Set.
Loss Function: SoftDiceLoss.
Optimizer: Adam (lr=0.001), ExponentialLR scheduler.
Training: 50 epochs.
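A common formulation of the soft Dice loss replaces the binary prediction with probabilities so that the overlap becomes differentiable. The sketch below uses NumPy for clarity; the smoothing term of 1.0 and the function name are assumptions, and the project's SoftDiceLoss may differ in detail.

```python
import numpy as np

def soft_dice_loss(probs, target, smooth=1.0):
    """1 - soft Dice, computed on probabilities rather than hard labels."""
    probs = np.asarray(probs, dtype=float).ravel()
    target = np.asarray(target, dtype=float).ravel()
    intersection = (probs * target).sum()
    dice = (2.0 * intersection + smooth) / (probs.sum() + target.sum() + smooth)
    return float(1.0 - dice)

target = np.array([[0, 1, 1], [0, 1, 1], [0, 0, 0]])
confident = soft_dice_loss(target, target)              # perfect prediction
uncertain = soft_dice_loss(np.full((3, 3), 0.5), target)  # 0.5 everywhere
```

Unlike BCE, this loss is driven by the overlap ratio, so a small foreground structure is not outweighed by the large background.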

Training Loss and Validation Loss (With Soft Dice Loss)

Training and Validation Loss for Baseline U-Net with Soft Dice Loss.

Results: Dice Coefficients

Structure	Baseline with BCE Loss	Baseline with Soft Dice Loss
LV Mean	0.9519	0.9566
MYO Mean	0.8734	0.8962
RV Mean	0.8920	0.8998
LV std	0.0086	0.0100
MYO std	0.0161	0.0100
RV std	0.0310	0.0371

Results: Accuracy

Structure	Baseline with BCE Loss	Baseline with Soft Dice Loss
LV Accuracy Mean	0.9991	0.9992
MYO Accuracy Mean	0.9977	0.9980
RV Accuracy Mean	0.9983	0.9983
LV Accuracy std	0.0002	0.0002
MYO Accuracy std	0.0003	0.0002
RV Accuracy std	0.0005	0.0006

Segmentation Examples (Soft Dice Loss)

Baseline with Soft Dice Loss Segmentation Example RV

Baseline with Soft Dice Loss Segmentation Example MYO

Baseline with Soft Dice Loss Segmentation Example LV

Discussion - Soft Dice Loss

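Segmentation Accuracy (Dice): Soft Dice Loss gave higher mean Dice coefficients than BCE Loss for all three structures on the same non-augmented data (LV 0.9519 to 0.9566, MYO 0.8734 to 0.8962, RV 0.8920 to 0.8998). The MYO gain of about 0.023 was the largest. This run also used a different learning-rate setup (lr=0.001 with an ExponentialLR scheduler, versus lr=0.01 for the baseline), so the gain may not be due to the loss function alone.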
Segmentation Accuracy (Pixel-wise): Pixel-wise accuracy also showed slight improvements or remained comparable at very high levels.
Conclusion for Task (d): Changing the training loss from cross-entropy (BCE) to Soft Dice Loss improved overall segmentation accuracy, especially when evaluated by the Dice coefficient, which is more sensitive to segmentation overlap.

Task (e): Improvements

This section explores two main improvements: using an Attention U-Net and employing a Hybrid Loss function.

Attention U-Net

Advanced UNet (Attention U-Net):
Architecture: Introduced AttentionBlock in the decoder’s Up module.
AttentionBlock: Computes attention coefficients by combining features from the decoder (gating signal) and encoder (skip connection), then applies these coefficients to the encoder features. This helps the model focus on relevant spatial regions during upsampling.
Loss Function: BCE Loss.
Optimizer: Adam (lr=0.001), ExponentialLR scheduler.
Training: 50 epochs.
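The AttentionBlock description above matches the additive attention gate commonly used in Attention U-Net variants. The NumPy sketch below illustrates that general pattern only: the weight matrices stand in for 1×1 convolutions, biases are omitted, the gating features are assumed to be already resized to the skip features' spatial size, and all names are hypothetical rather than taken from the project.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, W_x, W_g, psi):
    """Additive attention gate.

    x:   encoder skip features, shape (C_x, H, W)
    g:   decoder gating features, shape (C_g, H, W)
    W_x: (C_int, C_x), W_g: (C_int, C_g), psi: (C_int,)
    """
    theta = np.einsum("ic,chw->ihw", W_x, x)    # project skip features
    phi = np.einsum("ic,chw->ihw", W_g, g)      # project gating signal
    f = np.maximum(theta + phi, 0.0)            # ReLU on the combined features
    alpha = sigmoid(np.einsum("i,ihw->hw", psi, f))  # one coefficient per pixel
    return x * alpha[None, :, :], alpha

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))
g = rng.standard_normal((16, 16, 16))
gated, alpha = attention_gate(
    x, g,
    W_x=0.1 * rng.standard_normal((4, 8)),
    W_g=0.1 * rng.standard_normal((4, 16)),
    psi=rng.standard_normal(4),
)
```

The coefficients lie between 0 and 1, so the gate can only attenuate encoder features; regions the decoder considers irrelevant are suppressed before concatenation.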

Attention U-Net Architecture Diagram

Attention U-Net Architecture

Results: Dice Coefficients (Attention U-Net)

Structure	Baseline with BCE Loss	Baseline with Soft Dice Loss	Attention U-Net
LV Mean	0.9519	0.9566	0.9568
MYO Mean	0.8734	0.8962	0.8963
RV Mean	0.8920	0.8998	0.9029
LV std	0.0086	0.0100	0.0095
MYO std	0.0161	0.0100	0.0120
RV std	0.0310	0.0371	0.0370

Segmentation Examples (Attention U-Net)

Attention U-Net Segmentation Example RV

Attention U-Net Segmentation Example MYO

Attention U-Net Segmentation Example LV

Discussion - Attention U-Net

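The Attention U-Net improved on the BCE baseline for all three structures (LV 0.9519 to 0.9568, MYO 0.8734 to 0.8963, RV 0.8920 to 0.9029). Against the U-Net trained with Soft Dice Loss it is essentially tied on LV and MYO (differences of 0.0002 and 0.0001) and slightly higher on RV (0.9029 vs. 0.8998); all three differences are well within one standard deviation.
This is consistent with the attention mechanism helping the model focus on relevant regions and finer boundary details. However, the run also differs from the BCE baseline in its learning-rate setup (lr=0.001 with ExponentialLR), so the gain cannot be attributed to attention alone.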
Accuracy scores are very high across all structures, which is common in segmentation tasks with large background areas. Dice coefficient remains a more informative metric for evaluating overlap.
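A small worked example shows why pixel accuracy saturates when the background dominates. The image size and the 20×20 foreground square are made-up numbers for illustration only.

```python
import numpy as np

def pixel_accuracy(pred, target):
    return float((pred == target).mean())

def dice(pred, target):
    denom = pred.sum() + target.sum()
    return 1.0 if denom == 0 else float(2.0 * (pred & target).sum() / denom)

# 256x256 slice with a 20x20 structure: 400 of 65536 pixels are foreground.
target = np.zeros((256, 256), dtype=bool)
target[100:120, 100:120] = True

# A useless prediction that labels every pixel as background.
empty_pred = np.zeros_like(target)

acc_empty = pixel_accuracy(empty_pred, target)  # 1 - 400/65536, about 0.994
dice_empty = dice(empty_pred, target)           # 0.0
```

A model that finds nothing still scores about 0.994 in accuracy, while its Dice coefficient is zero, which is why the Dice tables are the more informative comparison.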

HybridLoss

Motivation: To further improve segmentation, especially at boundaries and for complex structures, by combining multiple complementary loss objectives. This aims to leverage the strengths of different loss types for a more holistic optimization.

HybridLoss Definition

The HybridLoss adaptively weights four distinct loss components:

Dice Loss (Overlap)
Binary Cross-Entropy (BCE) Loss (Pixel-wise accuracy)
Boundary Loss (Edge definition)
Hausdorff Distance Loss, approximated (Shape similarity)

The adaptive weighting uses learnable uncertainty parameters.
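One widely used way to weight several losses with learnable uncertainty parameters is to give each component a log-variance s_i and minimize the sum of exp(-s_i) * L_i + s_i. The sketch below shows only that combination rule; whether HybridLoss uses exactly this form is an assumption, and the function name and example values are hypothetical.

```python
import numpy as np

def uncertainty_weighted_loss(losses, log_vars):
    """Combine component losses with learnable log-variances.

    total = sum_i exp(-s_i) * L_i + s_i
    The +s_i term stops the optimizer from driving every weight to zero.
    """
    losses = np.asarray(losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    weights = np.exp(-log_vars)
    total = float((weights * losses + log_vars).sum())
    return total, weights

# Hypothetical component values: Dice, BCE, boundary, Hausdorff approximation.
components = [0.20, 0.50, 0.10, 0.30]
total, weights = uncertainty_weighted_loss(components, log_vars=np.zeros(4))
```

With all log-variances at zero the components are simply summed; during training, a component with a larger learned log-variance is down-weighted.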

Results with HybridLoss (Mean Dice Scores)

Model	LV Dice (SD)	MYO Dice (SD)	RV Dice (SD)
UNet + HybridLoss	0.9504 (0.0276)	0.8839 (0.0275)	0.9061 (0.0573)
Baseline U-Net	0.9519 (0.0086)	0.8734 (0.0161)	0.8920 (0.0310)
AttUNet + HybridLoss	0.9507 (0.0235)	0.8875 (0.0247)	0.9033 (0.0703)
Attention U-Net	0.9568 (0.0095)	0.8963 (0.0120)	0.9029 (0.0370)

Overall Performance Summary (Mean Dice Coefficients)

Model Configuration	LV Mean DSC	MYO Mean DSC	RV Mean DSC
(a) Baseline U-Net (BCE)	0.9519	0.8734	0.8920
(b) U-Net No Shortcut (BCE)	0.9260	0.8223	0.8588
(c) U-Net + Data Aug. (BCE)	0.9276	0.8469	0.8635
(d) U-Net (Soft Dice Loss, No Aug.)	0.9566	0.8962	0.8998
(e0) AttUNet (BCE)	0.9568	0.8963	0.9029
(e1) UNet + HybridLoss	0.9504	0.8839	0.9061
(e2) AttUNet + HybridLoss	0.9507	0.8875	0.9033

Overall Discussion

The AttUNet with BCE (e0) has the highest LV and MYO Dice scores, although its margin over the U-Net with Soft Dice Loss (d) is only 0.0002 and 0.0001.
UNet + HybridLoss (e1) achieved the best RV Dice score (0.9061).
No Universal Superiority: HybridLoss, despite its sophisticated multi-component design with adaptive weighting, did not prove to be a universally superior loss function in these experiments.
RV Segmentation Strength: Both HybridLoss models (e1, e2) match or exceed every other configuration on RV, even though their LV and MYO scores fall below those of (d) and (e0). Their RV standard deviations are also the largest reported (0.0573 and 0.0703), so this advantage should be read with caution.

Conclusion

Best Overall Performance (Structure-wise):
LV & MYO: AttUNet with BCE (Task e, Attention U-Net part; e0 in summary) shows the highest Dice scores.
RV: U-Net with HybridLoss (Task e, HybridLoss part; e1 in summary) achieves the best Dice score.
Complexity vs. Simplicity: A simpler model (U-Net) with a well-chosen, targeted loss function (Soft Dice Loss) can still be highly effective and may outperform more complex loss formulations on certain structures or metrics.
The performance of HybridLoss models suggests that further optimization (e.g., training duration, hyperparameter tuning of the loss components or solver) could potentially lead to even better results.

Deep-Learning-Dynamic-MRI-Reconstruction

2025-05-01T00:00:00+00:00

Dynamic MRI Reconstruction

Fig: Overall architecture of our proposed reconstruction network with dual UNet branches for real and imaginary components and 3D ResNet for temporal fusion

This is a repository for the project “Deep Learning for Dynamic MRI Reconstruction” as part of the course BME1312 Artificial Intelligence in Biomedical Imaging at ShanghaiTech University. The project focuses on using deep learning techniques to reconstruct dynamic MRI images from undersampled data.

This project uses deep learning to reconstruct high-quality dynamic MRI images from undersampled data. We propose a deep-learning-based denoising framework combining two independent UNet modules and a 3D ResNet to explore the temporal correlation. We generate variable density undersampling patterns with acceleration factor 5 and 11 central k-space lines, analyze the resulting aliasing artifacts, and evaluate reconstruction performance with PSNR and SSIM metrics. Additionally, we investigate the effects of dropout, dynamic learning rate schedules and compare L1 versus L2 losses.

TO START

Clone the repository and download the dataset from here

Our dataset cine.npz contains fully sampled cardiac cine MR images in an array of shape [nsamples, nt, nx, ny], where nsamples is the number of samples, nt is the number of frames, and nx and ny are the image dimensions.

Install the required packages:
```
pip install -r requirements.txt
```
Run the training script:
```
python train.py output
```

After that, you can find the undersampled images, reconstructed images in the image folder, and the training log in the output folder. We also provide the full sampling images and both real and imaginary parts of the UNet-reconstructed images in the image folder for reference.

Analyze the results by comparing the reconstructed images with the original images. PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity Index) quantify the reconstruction quality; both are reported in the output.txt file.
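For readers who want to recompute PSNR themselves, here is a minimal sketch. The choice of data range (the reference image's max minus min) is an assumption; the value written to output.txt may use a different normalization.

```python
import numpy as np

def psnr(reference, reconstruction, data_range=None):
    """Peak signal-to-noise ratio in dB: 10 * log10(range^2 / MSE)."""
    reference = np.asarray(reference, dtype=float)
    reconstruction = np.asarray(reconstruction, dtype=float)
    if data_range is None:
        # Assumed convention: dynamic range of the reference image.
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))

reference = np.linspace(0.0, 1.0, 64).reshape(8, 8)
noisy = reference + 0.1          # constant error of 0.1 on a [0, 1] image
value = psnr(reference, noisy)   # MSE = 0.01, so 10 * log10(1 / 0.01) = 20 dB
```

SSIM is more involved because it compares local means, variances, and covariances over sliding windows, so an existing library implementation is usually the safer choice.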

Variable Density Random Undersampling Pattern Generation

We generate a variable density random undersampling pattern (U) with the size of the given cine images for acceleration factor of 5. Eleven central k-space lines are sampled for each frame. Each sampling pattern must be a matrix with 1s in the sampled positions and 0s in the remaining ones.
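One possible way to build such a pattern is sketched below. It assumes that the nx axis is the phase-encoding (ky) direction, that k-space is centered at index nx // 2, and that the off-center lines are drawn from a Gaussian density; the function name, the density, and the 192×192 size in the usage line are illustrative choices, not the project's actual generator.

```python
import numpy as np

def variable_density_mask(nt, nx, ny, acceleration=5, n_center=11, seed=0):
    """Return a (nt, nx, ny) mask of 0s and 1s with a different pattern per frame."""
    rng = np.random.default_rng(seed)
    n_lines = int(round(nx / acceleration))     # lines kept per frame
    if n_lines < n_center:
        raise ValueError("acceleration too high for the requested center lines")

    center = nx // 2
    start = center - n_center // 2
    center_idx = np.arange(start, start + n_center)       # always sampled
    outer_idx = np.setdiff1d(np.arange(nx), center_idx)

    # Assumed Gaussian density: lines near the center are more likely.
    prob = np.exp(-0.5 * ((outer_idx - center) / (nx / 4.0)) ** 2)
    prob /= prob.sum()

    mask = np.zeros((nt, nx, ny), dtype=np.float32)
    for t in range(nt):
        extra = rng.choice(outer_idx, size=n_lines - n_center, replace=False, p=prob)
        mask[t, center_idx, :] = 1.0
        mask[t, extra, :] = 1.0
    return mask

mask = variable_density_mask(nt=20, nx=192, ny=192)
```

Plotting `mask[0]` gives the pattern for one dynamic frame, and `mask[:, :, 0].T` gives the ky-t view of how the sampled lines change over time.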

We also plot the undersampling mask for one dynamic frame and undersampling masks in the ky-t dimension.

We also obtained the aliased images as a result of the undersampling process with the generated patterns. For this we use the formula:

\[b = F^{-1} \cdot U \cdot F \cdot m\]

where $b$ is the aliased image, $F$ is the Fourier transform, $U$ is the undersampling mask, and $m$ is the original image. The aliased images are then used as input to the deep learning model for reconstruction.
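The formula can be written in a few lines of NumPy. This sketch assumes a centered k-space convention (hence the fftshift calls) and an orthonormal FFT; the function name is hypothetical.

```python
import numpy as np

def undersample(image, mask):
    """Compute b = F^{-1} U F m over the last two axes.

    image: real or complex array (..., nx, ny)
    mask:  array of 0s and 1s, broadcastable to image, centered k-space layout
    """
    kspace = np.fft.fftshift(np.fft.fft2(image, norm="ortho"), axes=(-2, -1))
    kspace_u = kspace * mask                    # keep only the sampled positions
    aliased = np.fft.ifft2(
        np.fft.ifftshift(kspace_u, axes=(-2, -1)), norm="ortho"
    )
    return aliased, kspace_u

rng = np.random.default_rng(0)
frames = rng.random((4, 16, 16))                # stand-in for a few cine frames
half_mask = np.zeros((4, 16, 16))
half_mask[:, ::2, :] = 1.0                      # keep every other k-space line
aliased, kspace_u = undersample(frames, half_mask)
```

The result is complex even for a real-valued input, because removing k-space lines breaks the conjugate symmetry of the spectrum; this is why the network later treats real and imaginary parts separately.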

Below are some examples of the aliased images generated from the original images.

Fig: Aliased image resulting from 5x undersampling of the cardiac MRI data

Below is a comparison of the aliased images with the original images, along with the sampling masks for some frames. Different frames use different sampling masks, which is a key feature of our approach to deep-learning-based reconstruction.

Fig: Comparison between fully sampled (left), undersampled (middle), and corresponding sampling mask (right) for frames

Fig: Multiple sampling masks showing the variable density patterns across different temporal frames

It is also clear to see that, for different dynamic frames, the undersampling masks are different.

Reconstruction Network

All the details of the network are in the train.py file.

To explore the temporal correlation, we stack the dynamic frames along the channel dimension. This raises a problem: the input image is pseudo-complex, and the real and imaginary parts are not aligned. To solve this, we split the input into two branches, one for the real part and one for the imaginary part, and concatenate the two branch outputs at the end of the UNet structure. We also added attention mechanisms to the bottleneck layer of each UNet to better capture spatial and channel correlations.

However, UNet is a 2D structure, and the temporal correlation is not well captured. To solve this, we added a 3D ResNet structure after the UNet structure to better achieve this goal.
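The data flow described above can be sketched in terms of array shapes alone. The sketch assumes that the number of frames equals the UNet channel count and that the two branch outputs are stacked as a 2-channel volume for the 3D ResNet; both function names are hypothetical and the networks themselves are left out.

```python
import numpy as np

def split_for_branches(aliased):
    """Split a complex batch (batch, nt, nx, ny) into real and imaginary inputs.

    Each output keeps the nt frames on the channel axis, so a 2D UNet branch
    sees all frames of one part at once.
    """
    return (np.real(aliased).astype(np.float32),
            np.imag(aliased).astype(np.float32))

def merge_for_temporal_fusion(real_out, imag_out):
    """Stack branch outputs to (batch, 2, nt, nx, ny) for 3D convolutions.

    Assumed layout: 2 feature channels (real, imaginary), nt as the depth axis.
    """
    return np.stack([real_out, imag_out], axis=1)

batch = np.ones((2, 20, 32, 32)) + 1j * np.zeros((2, 20, 32, 32))
real_in, imag_in = split_for_branches(batch)
# The two UNet branches would process real_in and imag_in here.
volume = merge_for_temporal_fusion(real_in, imag_in)
```

Moving the frames from the channel axis to a depth axis is what lets the 3D convolutions slide along time instead of mixing all frames in a single 2D layer.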

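So in general, the reconstruction network consists of three components: two 2D UNet branches (one for the real part and one for the imaginary part) and a 3D ResNet for temporal fusion.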

Dual 2D UNet Architecture (Real & Imaginary Components)

Purpose: Process the real and imaginary parts of the complex MRI data

Features:

Encoder-decoder structure with skip connections
Attention mechanism in the bottleneck layer
Dropout (p=0.3) for regularization
LeakyReLU activation (negative_slope=0.1)
Weight Regularization for better training stability
Channel and spatial attention modules

3D ResNet (Temporal Fusion)

Purpose: Integrate temporal information across the MRI sequence

Features:

3D convolutions to process the temporal dimension
Residual connections for better gradient flow
Lightweight design with one residual block per layer
Final 1×1×1 convolution to map features to output channels

The whole structure is shown in the figure below.

Fig: Detailed architecture of our reconstruction network showing dual UNet branches for processing real and imaginary components separately, followed by a 3D ResNet for temporal fusion across frames

Training and Evaluation

Below are the details of the training parameters:

train(in_channels=20,
      out_channels=20,
      init_features=64,
      num_epochs=800,
      weight_decay=1e-4,
      batch_size=10,
      initial_lr=1e-4,
      loss_tpe='L2'
    )

Using the above parameters, we achieved a mean PSNR of 29.08446121 and a mean SSIM of 0.84434632, a clear improvement over the aliased images. The whole training process took about 2 hours on a single NVIDIA RTX 2080 Ti GPU. More detailed results can be found in the output.txt file. Some of the reconstructed images are shown below next to the fully sampled originals.

Fig 1: Reconstructed cardiac MRI image using our deep learning model

Fig 2: Fully sampled reference cardiac MRI image (ground truth)

Fig 3: Another view of the reconstructed cardiac MRI image

Fig 4: Corresponding fully sampled reference image for comparison

Discussion on the Effect of Dropout, Dynamic Learning Rate Schedules, and Loss Functions

We investigated the impact of several training components on the reconstruction performance: dropout, dynamic learning rate schedules, and the choice of loss function (L1 vs. L2).

Impact of Dropout and Dynamic Learning Rate

We trained a model variant without dropout and without a dynamic learning rate schedule (using a constant learning rate). The training and validation loss curves are shown below:

Fig: Training and Validation Loss Curves without Dropout and Dynamic Learning Rate

Performance Metrics (No Dropout/Dynamic LR):

Loss: mean = 0.00343, std = 0.00127
PSNR: mean = 24.154, std = 1.858
SSIM: mean = 0.743, std = 0.037

Analysis:

The validation loss curve shows significant fluctuations and a tendency to increase towards the end of training, indicating overfitting. Compared to the original model (PSNR: 29.08, SSIM: 0.844), performance is considerably lower (PSNR: 24.15, SSIM: 0.743). This highlights the importance of regularization and learning rate scheduling for stable convergence and for avoiding overfitting. Because dropout and the dynamic schedule were removed together, this experiment does not separate their individual contributions.

Impact of L1 vs. L2 Loss Function

We trained another model variant using the L1 loss function instead of the default L2 loss, while keeping dropout and the dynamic learning rate schedule active.

Fig: Training and Validation Loss Curves using L1 Loss

Performance Metrics (L1 Loss):

Loss: mean = 0.02228, std = 0.00549
PSNR: mean = 29.151, std = 2.241
SSIM: mean = 0.84389, std = 0.042

Performance Metrics (Original L2 Loss):

Loss: mean = 0.00135, std = 0.00055
PSNR: mean = 29.084, std = 1.932
SSIM: mean = 0.84434, std = 0.037

Analysis:

The L1 loss values are inherently larger than L2 loss values, which is reflected in the mean loss. However, the PSNR and SSIM achieved with L1 loss are very similar to those achieved with L2 loss.

In our experiments on dynamic MRI reconstruction, L1 loss gave a marginally higher mean PSNR (29.151 vs. 29.084) but a marginally lower mean SSIM (0.84389 vs. 0.84434) than L2 loss. Both differences are far smaller than the corresponding standard deviations, so they may not be significant.

If the trend is real, it could be explained as follows:

PSNR (Peak Signal-to-Noise Ratio) measures pixel-wise accuracy and is closely related to the mean squared error (MSE). Although L1 loss optimizes absolute error instead of squared error, it still effectively reduces the overall pixel-level discrepancy, thereby improving PSNR.
SSIM (Structural Similarity Index) evaluates the structural similarity between images, focusing on local patterns of luminance, contrast, and structure. While L1 loss minimizes the global error, it does not explicitly encourage structural consistency. As a result, even small spatial misalignments or distortions, which may have little impact on PSNR, can lead to a drop in SSIM.

In short, L1 loss favors pixel-wise precision but may compromise local structural integrity, which would explain the direction of the observed differences.

To address this, one potential solution is to design a composite loss function that balances pixel accuracy and structural preservation, such as combining L1 loss with SSIM loss or perceptual loss based on feature space similarity.

The training curve appears stable. While L1 loss can sometimes promote sparsity, L2 loss often leads to smoother results and is more sensitive to large errors. In this case, both loss functions yield comparable high-quality reconstructions, and the original configuration with L2 loss showed slightly lower standard deviations in PSNR and SSIM. Its lower loss value is not a meaningful advantage, because L1 and L2 losses measure different quantities and are not directly comparable.

Ultimately, the choice of loss function should be guided by the final objective of the task:

If precise pixel recovery is the goal, prioritizing PSNR with L1 or L2 loss may suffice.
If maintaining perceptual quality and structural fidelity is crucial (e.g., in clinical imaging), incorporating structure-aware losses is strongly recommended.

Creativity: Comparison with Attention Mechanisms

To evaluate the impact of attention mechanisms, we incorporated channel attention and spatial attention modules into the baseline network. These modules were added to the encoder-decoder structure of the UNet branches to enhance feature representation by focusing on the most informative regions and channels.

The performance metrics for the attention-based model are compared with the baseline (optimized with dropout and dynamic LR) in the table below.

Model	PSNR mean	PSNR std dev	SSIM mean	SSIM std dev
Baseline (Optimized)	29.08446121	1.93235576	0.84434632	0.03711063
Attention-Based	32.38721751	1.87722389	0.89100512	0.03601302

Discussion

PSNR Improvement: The attention-based model achieved a higher mean PSNR compared to the baseline, indicating better pixel-wise reconstruction accuracy.
SSIM Improvement: The mean SSIM also improved, suggesting that the attention mechanisms helped preserve structural details more effectively.
Stability: The standard deviations for both PSNR and SSIM were slightly lower in the attention-based model, indicating more consistent performance across the test set.

Conclusion:

The experiments indicate that dropout and dynamic learning rate schedules are important for good performance and for preventing overfitting. Both L1 and L2 loss functions lead to good results, with L2 giving slightly lower standard deviations in our final configuration. Among the configurations without the added attention modules, the original setup (with dropout, dynamic LR, and L2 loss) provided the best balance of performance and stability. Adding channel and spatial attention raised the mean PSNR from 29.08 to 32.39 and the mean SSIM from 0.844 to 0.891.

Unrolled Denoising Network with Data Consistency Layer

We explored an unrolled network architecture incorporating data consistency layers between cascaded instances of our base reconstruction network. This approach aims to iteratively refine the reconstruction by enforcing consistency with the acquired k-space data after each denoising step.
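A data consistency step of this kind is often implemented by overwriting the network's k-space estimate with the measured values wherever a sample was acquired. The sketch below shows that hard (noise-free) variant with a centered, orthonormal FFT; the function names are hypothetical and the project's layer may differ, for example by using a soft weighting.

```python
import numpy as np

def fft2c(x):
    return np.fft.fftshift(np.fft.fft2(x, norm="ortho"), axes=(-2, -1))

def ifft2c(k):
    return np.fft.ifft2(np.fft.ifftshift(k, axes=(-2, -1)), norm="ortho")

def data_consistency(x_net, k0, mask):
    """Replace predicted k-space values with the acquired ones.

    x_net: network output in image space (..., nx, ny), complex
    k0:    acquired undersampled k-space, zero where nothing was measured
    mask:  1 at sampled positions, 0 elsewhere
    """
    k_pred = fft2c(x_net)
    k_dc = (1.0 - mask) * k_pred + mask * k0
    return ifft2c(k_dc)

rng = np.random.default_rng(0)
x_true = rng.random((16, 16)) + 1j * rng.random((16, 16))
mask = (rng.random((16, 16)) < 0.3).astype(float)
k0 = mask * fft2c(x_true)
x_net = x_true + 0.05 * rng.standard_normal((16, 16))   # imperfect estimate
x_dc = data_consistency(x_net, k0, mask)
```

In an unrolled network, each cascade would apply the denoising network followed by this step, so that errors at the measured k-space positions cannot accumulate.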

We trained models with 2 cascades (Cascade 2) and 3 cascades (Cascade 3).

Cascade 2 Results

Training: Required 18GB GPU memory, trained for 300 epochs. Average time per epoch: ~12 seconds.
Performance Metrics:
- Loss: mean = 0.00143, std = 0.00057
- PSNR: mean = 28.866, std = 2.048
- SSIM: mean = 0.834, std = 0.030

Cascade 3 Results

Training: Required 24GB GPU memory, trained for 300 epochs. Average time per epoch: ~360 seconds (6 minutes).
Performance Metrics:
- Loss: mean = 0.00137, std = 0.00051
- PSNR: mean = 28.958, std = 1.813
- SSIM: mean = 0.807, std = 0.041

Fig: Training and Validation Loss Curves for the 3-Cascade Unrolled Network

Comparison and Discussion

Model	Epochs	Avg Epoch Time	GPU Mem	Loss (mean ± std)	PSNR (mean ± std)	SSIM (mean ± std)
Original	800	~6 sec	~10GB	0.00135 ± 0.00055	29.084 ± 1.932	0.844 ± 0.037
Cascade 2	300	~12 sec	18GB	0.00143 ± 0.00057	28.866 ± 2.048	0.834 ± 0.030
Cascade 3	300	~360 sec	24GB	0.00137 ± 0.00051	28.958 ± 1.813	0.807 ± 0.041

Observations:

Increasing the number of cascades significantly increased GPU memory requirements and training time per epoch.
Both cascaded models were trained for only 300 epochs due to time and resource constraints, compared to 800 epochs for the original model.
The performance (PSNR, SSIM) of the cascaded models did not surpass the original single network. Cascade 3 showed slightly better PSNR than Cascade 2 but worse SSIM than both Cascade 2 and the original model. The validation loss for Cascade 3 also appears less stable than the original model’s later epochs.

Potential Reasons for Limited Improvement:

Insufficient Training Data: The dataset size (200 samples) might be insufficient, especially for deeper unrolled networks. Data augmentation using more undersampling masks could potentially help.
Base Network Complexity: The original network is relatively large. It might not have fully converged even after 800 epochs, potentially reaching a local minimum sufficient for good performance but hindering effective iterative refinement in the unrolled setting.
Training Constraints: Limited training epochs (300 vs. 800) due to significantly increased training time and memory usage likely prevented the cascaded models from reaching their full potential convergence.

Further investigation with more extensive data augmentation, potentially a smaller base network, and longer training times would be needed to fully evaluate the potential of the unrolled architecture.
