Model

A lightly post-trained version of talkie-lm/talkie-1930-13b-base: an initial round of SFT on an elicited persona, followed by KTO on Claude-preferred responses. It differs from the official instruction tune in three ways: it is instructed to play the character of an intelligent machine; it is tuned toward slightly more modern-day preferences (so it may adopt the views of a 1930s progressive); and its chat template forgoes XML, instead presenting the conversation as a transcript/play.

Recommended sampling settings are temp=0.5, min_p=0.05, top_k=40, repetition_penalty=1.2, repetition_context_size=128. Like the base model, it has a maximum context size of 2048 tokens. It also retains the (limited) few-shot learning ability of the base model, going from 7.73% on GSM8K at 1-shot to 11.30% at 2-shot and 12.36% at 4-shot.
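
The recommended settings can be wired up for local generation roughly as follows. This is a sketch assuming a recent mlx_lm that exposes make_sampler and make_logits_processors; the repo name passed to load is taken from this model page.

```python
# Sampling settings recommended above.
SAMPLING = {
    "temp": 0.5,
    "min_p": 0.05,
    "top_k": 40,
    "repetition_penalty": 1.2,
    "repetition_context_size": 128,
}

def generate_reply(prompt: str, max_tokens: int = 256) -> str:
    """Generate a completion with the recommended sampling settings.

    A sketch: assumes a recent mlx_lm where generate() accepts a sampler
    and logits_processors.
    """
    from mlx_lm import load, generate
    from mlx_lm.sample_utils import make_sampler, make_logits_processors

    model, tokenizer = load("N8Programs/talkie-box")
    sampler = make_sampler(
        temp=SAMPLING["temp"],
        min_p=SAMPLING["min_p"],
        top_k=SAMPLING["top_k"],
    )
    processors = make_logits_processors(
        repetition_penalty=SAMPLING["repetition_penalty"],
        repetition_context_size=SAMPLING["repetition_context_size"],
    )
    return generate(
        model,
        tokenizer,
        prompt=prompt,
        max_tokens=max_tokens,
        sampler=sampler,
        logits_processors=processors,
    )
```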

The seed prompts used for initial persona elicitation can be found here - each 'seed' was then continued by the base talkie model into a multi-turn conversation, the best of which formed the initial SFT round.

Safety Disclaimer

This model is ancient and its safety training is minimal. It may generate objectionable, hateful, or flat-out wrong content. It is strictly for experimental purposes, and I strongly recommend against its use in any user-facing application.

Chat template

This model uses a standard user / assistant chat API surface, but renders messages into TALKIE's play transcript format:

from mlx_lm import load

# Load the model and its tokenizer.
model, tokenizer = load("N8Programs/talkie-box")

messages = [
    {"role": "user", "content": "How are you?"},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

This produces:

The following conversation took place between the HUMAN, and TALKIE - a mechanical mind imbued with the knowledge of the world and the ability to use human language - a "thinking machine". It is published here for the benefit of the public:

HUMAN:

How are you?

TALKIE:
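
If your runtime lacks chat-template support, the transcript above can be rendered by hand. This is a minimal sketch that mirrors the displayed output; the exact whitespace of the official template is an assumption, so prefer tokenizer.apply_chat_template when available.

```python
# Preamble taken verbatim from the rendered prompt above.
HEADER = (
    'The following conversation took place between the HUMAN, and TALKIE - '
    'a mechanical mind imbued with the knowledge of the world and the ability '
    'to use human language - a "thinking machine". It is published here for '
    'the benefit of the public:'
)

def render_transcript(messages: list) -> str:
    """Render user/assistant messages into the play-transcript format.

    Ends with a bare 'TALKIE:' turn so the model continues as TALKIE.
    """
    speaker = {"user": "HUMAN", "assistant": "TALKIE"}
    parts = [HEADER]
    for m in messages:
        parts.append(f'{speaker[m["role"]]}:')
        parts.append(m["content"])
    parts.append("TALKIE:")  # generation prompt
    return "\n\n".join(parts) + "\n\n"

prompt = render_transcript([{"role": "user", "content": "How are you?"}])
```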

Generation stops

Stop generation when TALKIE starts a new speaker turn. For direct MLX generation, this model treats both <|endoftext|> (65535) and the HUM token (56180) as EOS. That makes bare mlx_lm.generate and mlx_lm.batch_generate stop when the model begins to write the next HUMAN: turn.

The model package also includes generation_config.json with these stop strings for runtimes that support text stops:

[
  "\n\nHUMAN:",
  "\nHUMAN:",
  "HUMAN:",
  "\n\nTALKIE:",
  "\nTALKIE:",
  "TALKIE:"
]

When serving through mlx_lm.server, pass the same strings as the request stop field.
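
For runtimes that support neither token-level EOS nor text stops, the same list can be applied client-side after generation. A minimal sketch:

```python
# Stop strings from generation_config.json, as listed above.
STOP_STRINGS = [
    "\n\nHUMAN:", "\nHUMAN:", "HUMAN:",
    "\n\nTALKIE:", "\nTALKIE:", "TALKIE:",
]

def trim_at_stop(text: str, stops=STOP_STRINGS) -> str:
    """Cut generated text at the earliest occurrence of any stop string."""
    cut = len(text)
    for s in stops:
        i = text.find(s)
        if i != -1:
            cut = min(cut, i)
    return text[:cut].rstrip()
```

For example, trim_at_stop("I am well.\n\nHUMAN:\n\nAnd you?") returns "I am well.".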
