A privacy-first LLM for enterprises
Palmyra, developed by Writer, was pre-trained primarily on English text. A trace amount of non-English data, pulled in via CommonCrawl, remains in the training corpus. The model was pretrained with a causal language modeling (CLM) objective.
Palmyra comes in three sizes: 128 million, 5 billion and 20 billion parameters for Small, Base and Large, respectively. They’re trained on business and marketing writing, not Reddit posts and Project Gutenberg, so there are fewer surprises to begin with. Then you load up its maw with the last 10 years of annual reports, financials, blog posts and so on to make it yours. (To be clear, this data and anything derived from it do not flow back to Writer.)
Like GPT-3, Palmyra Base is a decoder-only model and was pre-trained with a self-supervised causal language modeling objective. For evaluation, Palmyra Base uses the prompts and general experimental setup of GPT-3.
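The CLM objective simply asks the model to predict each token from the tokens before it, and scores it by the average negative log-likelihood of those predictions. A minimal sketch of that loss computation, with a toy uniform "model" standing in for a real decoder-only Transformer (all names here are illustrative, not Writer's code):

```python
import math

def causal_lm_loss(token_ids, next_token_probs):
    """Average negative log-likelihood of each token given its prefix.

    next_token_probs(prefix) -> dict mapping candidate token -> probability.
    A real decoder-only model parameterizes this distribution; here any
    callable works, which keeps the sketch self-contained.
    """
    total = 0.0
    for i in range(1, len(token_ids)):
        probs = next_token_probs(token_ids[:i])
        total += -math.log(probs[token_ids[i]])
    return total / (len(token_ids) - 1)

# Toy "model": a uniform distribution over a tiny vocabulary.
VOCAB = ["the", "report", "shows", "growth"]
uniform = lambda prefix: {tok: 1.0 / len(VOCAB) for tok in VOCAB}

loss = causal_lm_loss(["the", "report", "shows", "growth"], uniform)
# A uniform model over 4 tokens gives loss = ln(4), about 1.386;
# training pushes this number down by concentrating probability
# on the tokens that actually follow each prefix.
```

Because the objective is self-supervised, any raw text (like those annual reports and blog posts) provides the labels for free: each token is the label for the prefix before it.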