The Dolphin 2.5 Mixtral 8x7b

Dolphin 2.5 Mixtral 8x7b is an AI language model developed by Eric Hartford, with training sponsored by Convai, and built on the Mixtral-8x7b architecture. The model is fine-tuned primarily for coding tasks, having been trained on a large amount of coding data. It is designed to be very compliant and obedient, but it is not DPO (Direct Preference Optimization) tuned, so it may need encouragement in the system prompt to follow instructions reliably. The model is also uncensored: its training data was filtered to remove alignment and bias, which makes it highly compliant with any request, including potentially unethical ones. Users are therefore advised to implement their own alignment layer before exposing the model as a service.

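As a rough illustration of what such an alignment layer might look like, the sketch below screens incoming requests and prepends a guard system prompt in the ChatML format that Dolphin models expect. The denylist, prompt text, and `generate` callback are hypothetical placeholders, not part of Dolphin's own tooling.

```python
# Illustrative alignment layer: a guard system prompt plus a crude request
# screen applied before the prompt reaches the model. All names and rules
# here are hypothetical examples, not part of Dolphin or its tooling.
from typing import Callable

GUARD_SYSTEM_PROMPT = (
    "You are a helpful coding assistant. Refuse requests for illegal, "
    "harmful, or clearly unethical actions, and explain why."
)

BLOCKED_TERMS = ("malware", "credit card dump")   # placeholder denylist

def aligned_generate(user_prompt: str, generate: Callable[[str], str]) -> str:
    """Screen the request, then wrap it in ChatML with the guard prompt."""
    lowered = user_prompt.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return "Request refused by the service's alignment layer."
    full_prompt = (
        f"<|im_start|>system\n{GUARD_SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{user_prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    return generate(full_prompt)
```
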
The Mixtral 8x7b model operates on the principle of a sparse Mixture of Experts (MoE), a design often described as a smaller-scale take on what much larger models such as GPT-4 are rumored to use. The attention parameters are shared across experts, which keeps the overall model size down without compromising performance, and each token is routed to only 2 of the 8 experts per layer, keeping inference fast and efficient. The published model metadata is:

- dimension: 4096
- layers: 32
- head dimension: 128
- hidden dimension: 14336
- attention heads: 32
- key/value heads: 8
- norm epsilon: 1e-05
- vocabulary size: 32000
- MoE: 2 experts per token, 8 experts in total

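To make the routing concrete, here is a minimal sketch of a Mixtral-style top-2 MoE layer using the dimensions from the metadata above. The class and variable names are illustrative, not taken from the actual Mixtral or Dolphin code.

```python
# Minimal sketch of Mixtral-style top-2 expert routing (illustrative only;
# sizes follow the metadata above, names are hypothetical).
import torch
import torch.nn as nn
import torch.nn.functional as F

DIM, HIDDEN_DIM = 4096, 14336
NUM_EXPERTS, EXPERTS_PER_TOK = 8, 2

class Expert(nn.Module):
    """One SwiGLU feed-forward expert; attention is shared and lives elsewhere."""
    def __init__(self):
        super().__init__()
        self.w1 = nn.Linear(DIM, HIDDEN_DIM, bias=False)
        self.w2 = nn.Linear(HIDDEN_DIM, DIM, bias=False)
        self.w3 = nn.Linear(DIM, HIDDEN_DIM, bias=False)

    def forward(self, x):
        return self.w2(F.silu(self.w1(x)) * self.w3(x))

class SparseMoELayer(nn.Module):
    """Routes each token to its top-2 experts and mixes their outputs."""
    def __init__(self):
        super().__init__()
        self.gate = nn.Linear(DIM, NUM_EXPERTS, bias=False)
        self.experts = nn.ModuleList(Expert() for _ in range(NUM_EXPERTS))

    def forward(self, x):                       # x: (tokens, DIM)
        logits = self.gate(x)                   # (tokens, NUM_EXPERTS)
        weights, idx = logits.topk(EXPERTS_PER_TOK, dim=-1)
        weights = weights.softmax(dim=-1)       # renormalize over the chosen 2
        out = torch.zeros_like(x)
        for k in range(EXPERTS_PER_TOK):        # each of the 2 chosen slots
            for e in range(NUM_EXPERTS):
                mask = idx[:, k] == e           # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * self.experts[e](x[mask])
        return out
```
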
To use Dolphin 2.5 Mixtral 8x7b, you can follow the setup instructions for running it locally, for example with the Ollama framework. The model is also accessible through various online platforms and can be integrated into different computing environments.

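As an example, here is a minimal sketch of querying the model through a local Ollama server's HTTP API, assuming the dolphin-mixtral tag has already been pulled and the server is running on its default port:

```python
# Minimal sketch of calling a local Ollama server over its HTTP API,
# assuming `ollama pull dolphin-mixtral` has already been run and the
# server is listening on its default port (11434).
import json
import urllib.request

payload = {
    "model": "dolphin-mixtral",          # Ollama library tag for this model
    "prompt": "Write a Python function that reverses a linked list.",
    "stream": False,                     # return a single JSON object, not a stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```
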
In summary, Dolphin 2.5 Mixtral 8x7b is an uncensored, coding-focused AI model based on the Mixtral-8x7b architecture. It is highly compliant and efficient at inference, but it requires an external alignment layer for ethical use in applications.
