Explore my latest work in MLOps, from automated pipelines to scalable deployment solutions.
CodeReasoningPro is a large-scale synthetic dataset comprising 1,785,725 competitive programming problems in Python, created by XythicK, an MLOps Engineer.
A dataset of 62,941,756 chemistry questions covering Organic Chemistry (Alkenes, Nomenclature), Inorganic Chemistry (Oxidation States), and Physical Chemistry (Kinetics). Each row includes a question, 2-3 answers with explanations, and a difficulty level.
This repository contains scripts and documentation for generating a synthetic dataset inspired by the Open-Orca dataset. The dataset consists of conversational instruction-response pairs designed for natural language processing tasks.
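As a rough illustration of what a conversational instruction-response pair could look like on disk, here is a minimal Python sketch. The field names (`system_prompt`, `instruction`, `response`), the `InstructionPair` class, and the JSON Lines layout are assumptions made for this example; the repository's actual schema and file format may differ.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class InstructionPair:
    # Hypothetical field names, chosen for illustration only.
    system_prompt: str
    instruction: str
    response: str


def to_jsonl(pairs):
    """Serialize pairs as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(p), ensure_ascii=False) for p in pairs)


def from_jsonl(text):
    """Parse JSON Lines text back into InstructionPair records."""
    return [
        InstructionPair(**json.loads(line))
        for line in text.splitlines()
        if line.strip()
    ]


example = InstructionPair(
    system_prompt="You are a helpful assistant.",
    instruction="Explain what a JSON Lines file is.",
    response="A text file in which each line is one standalone JSON object.",
)

if __name__ == "__main__":
    print(to_jsonl([example]))
```

Keeping one record per line lets large synthetic datasets be streamed and appended to without loading the whole file into memory.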