---
license: mit
language:
- code
tags:
- transformer
- code-translation
- xlcost
- from-scratch
---

# C++ → Python Transformer (16.4M params)

Encoder-decoder transformer trained from scratch for C++ → Python code translation. Trained on XLCoST on a single GTX 1650 4 GB GPU. Best checkpoint at epoch 19, val_loss **2.0474**.

## Architecture

- 4 encoder + 4 decoder layers, pre-norm
- d_model 256, 8 heads, d_ff 512
- Sinusoidal positional encoding
- Greedy decoding at inference
- 16.4M parameters

## Files

- `best_model.pt`: full PyTorch checkpoint (model state, optimizer state, src/tgt vocabularies). 189 MB.

## Load

```python
from model import build_transformer
import torch

ckpt = torch.load('best_model.pt', map_location='cpu')

model = build_transformer(
    src_vocab_size=len(ckpt['src_vocab']),
    tgt_vocab_size=len(ckpt['tgt_vocab']),
    src_seq_len=300,
    tgt_seq_len=300,
    d_model=256,
    N=4,
    h=8,
    dropout=0.0,
    d_ff=512,
)
model.load_state_dict(ckpt['model_state'])
model.eval()
```

Full training and inference code: [github.com/debtirthasaha/cpp-to-python-transformer](https://github.com/debtirthasaha/cpp-to-python-transformer)

## Writeup

[A transformer that reads C++ and writes Python](https://debtirthasaha.github.io/blog/2026/cpp-to-python-transformer/)
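
## Sketch: greedy decoding

The Architecture section lists greedy decoding at inference. The following is a minimal, framework-free sketch of that idea and is not the repository's actual inference code. The names `greedy_decode`, `step_fn`, `sos_id`, and `eos_id` are hypothetical, and `max_len=300` is only assumed to mirror the `tgt_seq_len=300` used in the Load snippet. `step_fn` stands in for one decoder forward pass that returns a score per vocabulary entry.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    step_fn: Callable[[Sequence[int]], Sequence[float]],
    sos_id: int,
    eos_id: int,
    max_len: int = 300,  # assumption: mirrors tgt_seq_len in the Load snippet
) -> List[int]:
    """Generate one token at a time, always taking the top-scoring token.

    step_fn is a hypothetical stand-in for the decoder: it receives the token
    ids produced so far and returns one score per vocabulary entry.
    """
    tokens = [sos_id]
    while len(tokens) < max_len:
        scores = step_fn(tokens)
        # Greedy choice: index of the highest score (argmax).
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


# Toy stand-in for the model: a 5-token vocabulary that favours
# ids 2, then 3, then the end-of-sequence id 1.
def toy_step(prefix: Sequence[int]) -> List[float]:
    script = {1: 2, 2: 3, 3: 1}  # prefix length -> id to favour
    target = script.get(len(prefix), 1)
    return [1.0 if i == target else 0.0 for i in range(5)]


result = greedy_decode(toy_step, sos_id=0, eos_id=1)
print(result)
```

With a real model, `step_fn` would wrap the decoder forward pass and output projection, and the returned ids would be mapped back to Python tokens through the target vocabulary stored in the checkpoint. Greedy decoding is cheap, which suits a small GPU, but it can miss higher-scoring sequences that beam search would find.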