The Future of Models.
Built by Hand.

Model Portfolio

Showcasing language, vision, and finance models built with care, experimentation, and shipping discipline.

SHERWIN builds models from scratch, fine-tunes them, and ships them publicly.

What You’re Seeing

This portfolio collects the projects I am proudest of: a custom GPT-style language model trained on FineWeb-Edu, a Llama 3.1 fine-tune, a class-conditional Fashion-MNIST diffusion model, and a BTC forecasting app.

WHY THIS PORTFOLIO EXISTS

Each project shows a different part of the stack: language modeling, generative vision, and predictive finance. Together they map how I think about training, experimentation, and deployment.

FROM SCRATCH

I designed and trained a GPT-style transformer in PyTorch, including masked self-attention, causal generation, and sampling utilities.
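The two ideas named above, causal masking and sampling, can be sketched in plain Python. This is a minimal illustration and not the project's PyTorch code: the function names `causal_mask` and `sample_next_token` and the temperature and top-k parameters are assumptions chosen for the example.

```python
import math
import random


def causal_mask(seq_len):
    """Lower-triangular mask: position i may attend to positions 0..i only."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def sample_next_token(logits, temperature=1.0, top_k=None, rng=random):
    """Sample one token id from raw logits (hypothetical helper, not the author's)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Rank token ids by logit and keep only the top_k highest, if requested.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        ranked = ranked[:max(1, top_k)]
    # Temperature-scaled softmax weights over the surviving candidates.
    scaled = [logits[i] / temperature for i in ranked]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(ranked, weights=weights, k=1)[0]
```

Lower temperatures concentrate probability on the highest logits, and `top_k=1` reduces to greedy decoding.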

FINE-TUNING

I adapted Llama 3.1 into Dark_Llama_f16 with LoRA and Unsloth, showing efficient model adaptation and release discipline.
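As a rough sketch of why LoRA is efficient: the pretrained weight matrix W stays frozen, and only two small matrices A and B of rank r are trained. The helpers below are illustrative plain-Python stand-ins, not Unsloth's API, and the 4096 by 4096 layer size and rank 16 in the usage note are assumed numbers, not the settings used for Dark_Llama_f16.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(x, W, A, B, alpha):
    """Compute y = W x + (alpha / r) * B (A x); W is frozen, A and B are trainable."""
    r = len(A)  # adapter rank equals the number of rows in A
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def lora_trainable_params(d_in, d_out, r):
    """A has shape (r, d_in) and B has shape (d_out, r)."""
    return r * (d_in + d_out)
```

For a hypothetical 4096 by 4096 projection with r = 16, `lora_trainable_params(4096, 4096, 16)` gives 131,072 trainable values against 16,777,216 in the full matrix, which is under 1 percent.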

GEN VISION

I built a class-conditional diffusion model for Fashion-MNIST, pairing label embeddings with timestep conditioning to generate images from noise.
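One common way to pair the two signals is to embed the timestep with sinusoids, look up a learned vector for the class label, and add them into a single conditioning vector for the denoiser. The sketch below assumes that design; the names, the embedding size, and the choice to sum rather than concatenate are illustrative and may differ from the actual model.

```python
import math
import random


def timestep_embedding(t, dim):
    """Sinusoidal embedding of an integer timestep t; dim is assumed to be even."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]


class LabelEmbedding:
    """Lookup table with one vector per class (randomly initialised stand-in)."""

    def __init__(self, num_classes, dim, seed=0):
        rng = random.Random(seed)
        self.table = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                      for _ in range(num_classes)]

    def __call__(self, label):
        return self.table[label]


def conditioning_vector(t, label, label_embedding, dim):
    """Sum the timestep and label embeddings into one conditioning vector."""
    t_emb = timestep_embedding(t, dim)
    y_emb = label_embedding(label)
    return [a + b for a, b in zip(t_emb, y_emb)]
```

Fashion-MNIST has 10 classes, so `LabelEmbedding(10, dim)` would cover every label in this sketch.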

FORECASTING

I packaged a Bitcoin price prediction workflow into a Streamlit app backed by a saved TensorFlow model, historical BTC-USD data, and multi-day forecasts.
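Multi-day forecasts from a one-step model are often produced recursively: predict the next day, append that prediction to the history, and repeat. The sketch below assumes that approach, which may not match the app. `predict_next` is a hypothetical stand-in for the saved TensorFlow model, and `mean_of_window` is a toy predictor used only to make the example runnable.

```python
def forecast_days(history, predict_next, window, days):
    """Roll a one-step predictor forward `days` times, feeding predictions back in."""
    if len(history) < window:
        raise ValueError("need at least `window` past values")
    values = list(history)
    forecasts = []
    for _ in range(days):
        next_value = predict_next(values[-window:])  # only the most recent window
        forecasts.append(next_value)
        values.append(next_value)  # the prediction becomes input for the next step
    return forecasts


def mean_of_window(recent):
    """Toy predictor: the average of the window (placeholder for a trained model)."""
    return sum(recent) / len(recent)
```

Because each step consumes earlier predictions, errors compound as the horizon grows, which is worth keeping in mind when reading forecasts several days out.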

TRAINING PIPELINES

From tokenization to evaluation, I keep the training loop reproducible so experiments can be compared and improved with confidence.
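Two small habits go a long way toward comparable experiments: fixing random seeds and tagging each run with a fingerprint of its configuration. The helpers below are an illustrative sketch using only the standard library; the function names and config keys are hypothetical and not taken from the projects.

```python
import hashlib
import json
import random


def config_fingerprint(config):
    """Short, stable hash of a run config; key order does not affect the result."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def seeded_shuffle(items, seed):
    """Shuffle a copy of `items` with a private RNG so the order is repeatable."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled
```

Storing the fingerprint next to each checkpoint and metrics file makes it easy to tell whether two results came from the same settings.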

SHIPPING & SHARING

I publish models on Hugging Face and wrap applied projects in interfaces that make the result easy to inspect and use.

Inside the Portfolio

A portfolio of models and experiments across text generation, diffusion, and financial forecasting. Each piece documents a different part of the build process.

FROM-SCRATCH GPT
DARK_LLAMA_F16
FASHION DIFFUSION
BTC FORECASTING
FINEWEB_EDU_GPT_100M
HUGGING FACE RELEASES

From-Scratch GPT

Custom GPT-style transformer built in PyTorch and trained on FineWeb-Edu

Dark_Llama_f16

Llama 3.1 fine-tune released in a portable GGUF / Transformers-friendly format

Fashion-MNIST Diffusion

Class-conditional diffusion with timestep and label conditioning

BTC Predictor

Streamlit forecasting app for BTC-USD with a saved TensorFlow model

Qwen_3b_medical_o1_reasoning

Medical reasoning fine-tune with Unsloth, LoRA, and structured clinical analysis

fineweb_edu_gpt_100m

163M-param GPT trained from scratch on 100M FineWeb-Edu tokens with PyTorch

Intelligence

Model

What These Projects Cover

Language, vision, and finance each get their own experiments here: GPT-style models trained from scratch, fine-tuned LLMs, a diffusion system, and an applied BTC forecasting app.

Dark_Llama_f16

A Llama 3.1 fine-tune with LoRA and Unsloth, focused on efficient adaptation and conversational generation.

gpt-124m-fineweb-edu-10m-tokens

A custom 124M GPT-style transformer written from scratch in PyTorch and trained on 10M FineWeb-Edu tokens.

fashion_mnist_diffusion_class_conditional

A class-conditional UNet diffusion model for Fashion-MNIST with timestep and label conditioning.

Qwen_3b_medical_o1_reasoning

A 3B medical reasoning model fine-tuned with Unsloth and LoRA for structured clinical analysis.

fineweb_edu_gpt_100m

A 163M-parameter GPT-style transformer trained from scratch on 100M FineWeb-Edu tokens. It has 12 layers, 12 attention heads, and an embedding dimension of 768. Built with PyTorch and trained with AdamW on an RTX 3060 Ti.
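The parameter count can be sanity-checked with a little arithmetic. The sketch below assumes a GPT-2-style layout with a 50,257-token vocabulary, a 1,024-token context, biases on linear layers, and a 4x MLP; none of these are stated above. Under those assumptions, 12 layers at width 768 come to about 124M parameters when the output head shares weights with the token embedding, and about 163M when it does not, which is one plausible way the two figures in this portfolio relate.

```python
def gpt_param_count(n_layer, n_embd, vocab_size, block_size, tie_weights):
    """Count parameters of a GPT-2-style decoder (assumed layout, see lead-in)."""
    token_emb = vocab_size * n_embd
    pos_emb = block_size * n_embd
    layer_norm = 2 * n_embd  # scale and shift
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    per_layer = 2 * layer_norm + attn + mlp
    total = token_emb + pos_emb + n_layer * per_layer + layer_norm  # final norm
    if not tie_weights:
        total += vocab_size * n_embd  # separate output projection, no bias
    return total


# Assumed vocabulary and context sizes; only layers and width come from the text.
tied = gpt_param_count(12, 768, 50257, 1024, tie_weights=True)
untied = gpt_param_count(12, 768, 50257, 1024, tie_weights=False)
print(f"tied: {tied:,}  untied: {untied:,}")
```

The number of attention heads does not appear in the formula because splitting the width into heads changes how the projections are used, not how many weights they hold.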

Project Questions

The parts people usually ask about

These notes explain the scope of the work, how the models were built, and where each project lives.

What is this portfolio?

It is a model portfolio that brings together my language, vision, and forecasting work in one place.

What makes these projects worth showing?

They cover the full build process: dataset choice, architecture design, training, fine-tuning, evaluation, packaging, and deployment.

Where are the models hosted?

The models are published on Hugging Face, and the BTC predictor is packaged as a Streamlit app for interactive use.

How should someone contact you?

Use my portfolio contact link: https://artificialwizard.in/

Is the portfolio finished?

Not quite. The portfolio will keep growing as I add more models, experiments, and polished demos.

Model Atlas

by Sherwin Roger