Language Models
bocoel.GenerativeModel
Bases: Protocol
generate abstractmethod
generate(prompts: Sequence[str]) -> Sequence[str]
Generate a sequence of responses given prompts. The returned sequence has the same length as prompts. Each response is a continuation of its prompt, i.e. the prompt is a prefix of the response.
Parameters
prompts: Sequence[str]
The prompts to generate responses from.
Returns
A sequence of responses, one string per prompt, with the same length as prompts.
Source code in bocoel/models/lms/interfaces/generative.py
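For orientation, here is a minimal sketch of a class conforming to this protocol. The EchoModel name and its fixed continuation are invented for illustration; the sketch only demonstrates the length and prefix contract of generate.

```python
from typing import Sequence

from bocoel import GenerativeModel


class EchoModel(GenerativeModel):
    """Toy generator: continues every prompt with a fixed suffix."""

    def generate(self, prompts: Sequence[str]) -> Sequence[str]:
        # The output has the same length as `prompts`, and every
        # response keeps its prompt as a prefix.
        return [prompt + " and so on." for prompt in prompts]


responses = EchoModel().generate(["Hello", "The answer is"])
assert len(responses) == 2 and responses[0].startswith("Hello")
```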
bocoel.ClassifierModel
Bases: Protocol
choices property
choices: Sequence[str]
The choices for this language model.
classify
classify(prompts: Sequence[str]) -> NDArray
Generate logits given prompts.
Parameters
prompts: Sequence[str]
The prompts to generate logits from.
Returns
A list of logits, one row per prompt. Each row has one entry per choice.
Source code in bocoel/models/lms/interfaces/classifiers.py
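As a usage sketch, the helper below (a hypothetical best_choice function, not part of bocoel) picks the argmax choice for each prompt; it relies only on the documented output shape of classify and on the choices property.

```python
import numpy as np
from numpy.typing import NDArray

from bocoel import ClassifierModel


def best_choice(model: ClassifierModel, prompts: list[str]) -> list[str]:
    """Return the highest-logit choice for each prompt."""
    logits: NDArray = model.classify(prompts)
    # classify returns one row per prompt and one column per choice.
    winners = np.argmax(logits, axis=-1)
    return [model.choices[int(i)] for i in winners]
```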
_classify abstractmethod
_classify(prompts: Sequence[str]) -> NDArray
Generate logits given prompts.
Parameters
prompts: Sequence[str]
The prompts to generate logits from.
choices: Sequence[str]
The choices for this batch of prompts.
Returns
A list of logits. Must have the shape [batch_size, len(choices)].
Source code in bocoel/models/lms/interfaces/classifiers.py
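Below is a minimal sketch of a concrete classifier. ConstantClassifier is invented for illustration and assumes that providing choices and _classify is enough for the inherited classify wrapper to work; the only contract it demonstrates is the [batch_size, len(choices)] output shape.

```python
from typing import Sequence

import numpy as np
from numpy.typing import NDArray

from bocoel import ClassifierModel


class ConstantClassifier(ClassifierModel):
    """Returns all-zero logits; only the shape contract matters here."""

    def __init__(self, choices: Sequence[str]) -> None:
        self._choices = choices

    @property
    def choices(self) -> Sequence[str]:
        return self._choices

    def _classify(self, prompts: Sequence[str]) -> NDArray:
        # Must have the shape [batch_size, len(choices)].
        return np.zeros((len(prompts), len(self._choices)))
```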
bocoel.HuggingfaceTokenizer
HuggingfaceTokenizer(model_path: str, device: str)
Source code in bocoel/models/lms/huggingface/tokenizers.py
tokenize
tokenize(prompts: Sequence[str])
Tokenize, pad, truncate, cast to the device, and return the encoded results. The return value is a BatchEncoding, but this is not reflected in the type hint because transformers is an optional dependency.
Source code in bocoel/models/lms/huggingface/tokenizers.py
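A small usage sketch follows. The "gpt2" checkpoint and the printed field are illustrative assumptions; the only documented behavior relied on is that tokenize returns a BatchEncoding already moved to the device.

```python
from bocoel import HuggingfaceTokenizer

# "gpt2" is just an example checkpoint; downloading it requires network access.
tokenizer = HuggingfaceTokenizer(model_path="gpt2", device="cpu")

encoded = tokenizer.tokenize(["Hello world", "A somewhat longer second prompt"])
# The result behaves like a transformers BatchEncoding: padded, truncated,
# and already on the requested device.
print(encoded["input_ids"].shape)
```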
bocoel.HuggingfaceGenerativeLM
HuggingfaceGenerativeLM(model_path: str, batch_size: int, device: str)
Bases: GenerativeModel
The Huggingface implementation of GenerativeModel. This is a wrapper around the Huggingface transformers library that pulls the model from the Huggingface hub.
Since Huggingface's tokenizer pads on the left for generation, batched inputs do not receive the same positional embeddings as unbatched ones, so results may differ. If identical results to generating one prompt at a time are desired, set batch_size to 1.
Source code in bocoel/models/lms/huggingface/generative.py
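The following sketch shows typical construction and generation. The checkpoint name is an assumption, and batch_size=1 sidesteps the left-padding caveat above at the cost of throughput.

```python
from bocoel import HuggingfaceGenerativeLM

prompts = ["Once upon a time", "The capital of France is"]

# "gpt2" is an illustrative checkpoint; batch_size=1 matches one-by-one generation.
lm = HuggingfaceGenerativeLM(model_path="gpt2", batch_size=1, device="cpu")

responses = lm.generate(prompts)
for prompt, response in zip(prompts, responses):
    # Per the GenerativeModel contract, each response continues its prompt.
    print(f"{prompt!r} -> {response!r}")
```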
bocoel.HuggingfaceLogitsLM
HuggingfaceLogitsLM(
model_path: str, batch_size: int, device: str, choices: Sequence[str]
)
Bases: HuggingfaceGenerativeLM, ClassifierModel
The Huggingface implementation of ClassifierModel that uses token logits for classification. For example, if choices = ['1', '2', '3', '4', '5'], the logits of the tokens '1' through '5' are used as the output for the current batch of inputs.
Source code in bocoel/models/lms/huggingface/logits.py
classify
classify(prompts: Sequence[str]) -> NDArray
Generate logits given prompts.
Parameters
prompts: Sequence[str]
The prompts to generate logits from.
Returns
A list of logits, one row per prompt. Each row has one entry per choice.
Source code in bocoel/models/lms/interfaces/classifiers.py
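A usage sketch, assuming a five-way rating task; the "gpt2" checkpoint and the prompt are placeholders. It shows that the returned logits have one column per entry in choices.

```python
from bocoel import HuggingfaceLogitsLM

choices = ["1", "2", "3", "4", "5"]

# "gpt2" is only an example checkpoint.
lm = HuggingfaceLogitsLM(model_path="gpt2", batch_size=4, device="cpu", choices=choices)

logits = lm.classify(["Rate the review 'Loved every minute of it!' from 1 to 5:"])
print(logits.shape)  # Expected: (1, 5) -- one row per prompt, one column per choice.
```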
bocoel.HuggingfaceSequenceLM
HuggingfaceSequenceLM(model_path: str, device: str, choices: Sequence[str])
Bases: ClassifierModel
Source code in bocoel/models/lms/huggingface/sequences.py
classify
classify(prompts: Sequence[str]) -> NDArray
Generate logits given prompts.
Parameters
prompts: Sequence[str]
The prompts to generate logits from.
Returns
A list of logits, one row per prompt. Each row has one entry per choice.
Source code in bocoel/models/lms/interfaces/classifiers.py
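Finally, a sketch for the sequence-classification variant. The SST-2 checkpoint is an assumption chosen because its two labels line up with the two choices below; any sequence-classification model with a matching label count should work.

```python
import numpy as np

from bocoel import HuggingfaceSequenceLM

# Illustrative checkpoint; its two labels match the two choices below.
lm = HuggingfaceSequenceLM(
    model_path="distilbert-base-uncased-finetuned-sst-2-english",
    device="cpu",
    choices=["negative", "positive"],
)

logits = lm.classify(["What a wonderful day!"])
print(lm.choices[int(np.argmax(logits[0]))])  # Most likely: "positive"
```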