⚡️ Biggest open text dataset release of the year: SmolTalk is a 1M-sample synthetic dataset that was used to train SmolLM v2.

TL;DR:
🧩 New datasets: Smol-Magpie-Ultra (400K) for instruction tuning; Smol-constraints (36K) for precise output; Smol-rewrite (50K) & Smol-summarize (100K) for rewriting and summarization.
🤝 Public Dataset Integrations: OpenHermes2.5 (100K), MetaMathQA & NuminaMath-CoT, Self-Oss-Starcoder2-Instruct, LongAlign & SystemChats2.0
🥇 Outperforms the new Orca-AgentInstruct 1M when used to train 1.7B and 7B models
🏆 Outperforms models trained on OpenHermes and Magpie Pro on IFEval and MT-Bench
Used distilabel to generate all new synthetic datasets
🤗 Released under Apache 2.0 on Hugging Face

Synthetic generation pipelines and training code released.

Dataset: https://huggingface.co/datasets/HuggingFaceTB/smoltalk
Generation Code: https://github.com/huggingface/smollm
Training Code: https://github.com/huggingface/alignment-handbook/tree/main/recipes/smollm2
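
A minimal sketch for trying the dataset, assuming the `datasets` library is installed. The config name "all" and the split name "train" are assumptions not stated in the post, so check the dataset card before relying on them. The dict only tallies the sample counts the TL;DR gives explicitly (its keys are display labels, not official subset names), so it covers 686K of the 1M samples.

```python
# Sample counts stated explicitly in the TL;DR above.
# Subsets listed without a size (MetaMathQA, NuminaMath-CoT, etc.) are omitted.
# Keys are display labels only, not guaranteed to match subset names on the Hub.
STATED_SIZES = {
    "smol-magpie-ultra": 400_000,
    "smol-constraints": 36_000,
    "smol-rewrite": 50_000,
    "smol-summarize": 100_000,
    "openhermes-2.5 (subset)": 100_000,
}

REPO_ID = "HuggingFaceTB/smoltalk"  # taken from the Dataset link above


def stated_total(sizes=STATED_SIZES):
    """Return the sum of the sample counts the post gives explicitly."""
    return sum(sizes.values())


def load_smoltalk(config="all", split="train"):
    """Load SmolTalk from the Hugging Face Hub.

    Needs `pip install datasets` and network access.
    The default config and split names are assumptions.
    """
    # Imported lazily so the tally above works without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split=split)


if __name__ == "__main__":
    print(f"Subsets with a stated size cover {stated_total():,} samples")
```

Calling `load_smoltalk()` downloads the data, so the script only prints the tally by default.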

@Machine_learn



tg-me.com/Machine_learn/3092

BY Machine learning books and papers







Unlimited members in Telegram group now

Telegram has made it easier for users to communicate by introducing a feature that lets a group chat grow beyond 200,000 members. A group that approaches that limit can be converted into a "Broadcast Group", but the feature comes with a restriction. "Groups with close to 200k members can be converted to a Broadcast Group that allows unlimited members. Only admins can post in Broadcast Groups, but everyone can read along and participate in group Voice Chats," Telegram said.

Telegram Auto-Delete Messages in Any Chat

Some messages aren’t supposed to last forever. In some Telegram groups and conversations it’s best if messages are automatically deleted after a day or a week. You can enable the auto-delete feature on a per-chat basis, and it works for both one-on-one conversations and group chats. Previously, you needed to use the Secret Chat feature to delete messages automatically after a set time. At the time of writing, you can choose to delete messages automatically after a day or a week. The timer starts when a message is sent, not when it is read. Messages sent before the feature was enabled are not affected.
