
A model portfolio showcasing language, vision, and finance models built with care, experimentation, and shipping discipline.
I designed and trained a GPT-style transformer in PyTorch, including masked self-attention, causal generation, and sampling utilities.
I adapted Llama 3.1 into Dark_Llama_f16 with LoRA and Unsloth, showing efficient model adaptation and release discipline.
I built a class-conditional diffusion model for Fashion-MNIST, pairing label embeddings with timestep conditioning to generate images from noise.
I packaged a Bitcoin price prediction workflow into a Streamlit app backed by a saved TensorFlow model, historical BTC-USD data, and multi-day forecasts.
From tokenization to evaluation, I keep the training loop reproducible so experiments can be compared and improved with confidence.
I publish models on Hugging Face and wrap applied projects in interfaces that make the result easy to inspect and use.
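The sampling utilities and causal generation mentioned for the from-scratch GPT can be illustrated with a minimal sketch. This is not the code from the project: the function names `sample_next_token` and `generate`, the `next_logits` callback, and the choice of temperature plus top-k sampling are all assumptions made for illustration, and the sketch uses only the Python standard library rather than PyTorch.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, rng=random):
    """Pick one token id from raw logits (hypothetical helper, not the project's code)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Rank token ids by logit and optionally keep only the top_k most likely ones.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    # Temperature-scaled softmax weights; the max is subtracted for numerical stability.
    scaled = [logits[i] / temperature for i in ranked]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # random.choices normalizes the weights internally.
    return rng.choices(ranked, weights=weights, k=1)[0]


def generate(next_logits, prompt_ids, max_new_tokens, **sampling):
    """Causal generation: each new token depends only on the tokens produced so far."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # next_logits stands in for a model forward pass over the current prefix.
        ids.append(sample_next_token(next_logits(ids), **sampling))
    return ids
```

With `top_k=1` the sampler reduces to greedy decoding, which makes the loop deterministic and easy to check against a toy `next_logits` function before swapping in a real model.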
FROM-SCRATCH GPT
DARK_LLAMA_F16
FASHION DIFFUSION
BTC FORECASTING
FINEWEB_EDU_GPT_100M
HUGGING FACE RELEASES
Custom GPT-style transformer built in PyTorch and trained on FineWeb-Edu
Llama 3.1 fine-tune released in a portable GGUF / Transformers-friendly format
Class-conditional diffusion with timestep and label conditioning
Streamlit forecasting app for BTC-USD with a saved TensorFlow model
Medical reasoning fine-tune with Unsloth, LoRA, and structured clinical analysis
163M-param GPT trained from scratch on 100M FineWeb-Edu tokens with PyTorch
A Llama 3.1 fine-tune with LoRA and Unsloth, focused on efficient adaptation and conversational generation.
A custom 124M GPT-style transformer written from scratch in PyTorch and trained on 10M FineWeb-Edu tokens.
A class-conditional UNet diffusion model for Fashion-MNIST with timestep and label conditioning.
A 3B medical reasoning model fine-tuned with Unsloth and LoRA for structured clinical analysis.
A 163M-parameter GPT-style transformer trained from scratch on 100M FineWeb-Edu tokens. It has 12 layers, 12 attention heads, and an embedding dimension of 768, and was built with PyTorch and AdamW on an RTX 3060 Ti.
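As a rough sanity check on the 163M figure, the parameter count of a 12-layer, 768-dimensional GPT can be worked out by hand. The sketch below assumes a GPT-2-style layout that is not stated here: a 50,257-token vocabulary, a 1,024-token context, learned position embeddings, biases on every linear layer, a 4x MLP expansion, and an output head that is not tied to the token embedding. The function name `gpt_param_count` is hypothetical.

```python
def gpt_param_count(n_layer=12, n_head=12, d_model=768,
                    vocab_size=50257, block_size=1024, tie_embeddings=False):
    """Count parameters of a GPT-2-style decoder (assumed layout, see lead-in)."""
    # The head count splits d_model but does not change the parameter total.
    assert d_model % n_head == 0
    tok_emb = vocab_size * d_model   # token embedding table
    pos_emb = block_size * d_model   # learned position embeddings
    # Attention: fused QKV projection plus output projection, both with biases.
    attn = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # MLP: expand to 4 * d_model, then project back, both with biases.
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    norms = 2 * 2 * d_model          # two LayerNorms per block (weight and bias)
    per_layer = attn + mlp + norms
    final_norm = 2 * d_model
    # An untied output head adds a second vocab_size x d_model matrix.
    lm_head = 0 if tie_embeddings else vocab_size * d_model
    return tok_emb + pos_emb + n_layer * per_layer + final_norm + lm_head


print(f"untied head: {gpt_param_count():,}")
print(f"tied head:   {gpt_param_count(tie_embeddings=True):,}")
```

Under these assumptions the untied configuration comes to 163,037,184 parameters, consistent with the 163M figure, while tying the output head to the token embedding would give 124,439,808.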
This is a model portfolio that brings together my language, vision, and forecasting work in one place.
The projects cover the full build process: dataset choice, architecture design, training, fine-tuning, evaluation, packaging, and deployment.
The models are published on Hugging Face, and the BTC predictor is packaged as a Streamlit app for interactive use.
Use my portfolio contact link: https://artificialwizard.in/
The portfolio is not finished: it will keep growing as I add more models, experiments, and polished demos.