api
datatui.datatui(input_stream, collection_name, cache_name='annotations', pbar=True, description=None, content_render=lambda x: x['text'])
Main function to run the datatui application.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`input_stream` | `list` | A list of examples to annotate. | required |
`collection_name` | `str` | The name of the collection for these examples. | required |
`cache_name` | `str` | The name or path of the cache to use for storing annotations. | `'annotations'` |
`pbar` | `bool` | Whether to display a progress bar. Defaults to True. | `True` |
`description` | `str` | A description to display above each example. Defaults to None. | `None` |
`content_render` | `function` | A function to render the content of each example. Defaults to `lambda x: x['text']`. | `lambda x: x['text']` |
This function initializes and runs the DatatuiApp, which provides a text-based user interface for annotating examples. It uses the provided cache to store annotations and allows users to navigate through examples, annotating them as 'yes', 'no', 'maybe', or skipping them.
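A minimal sketch of calling this function with a custom `content_render`, based only on the signature above. The import path `from datatui import datatui` is inferred from the dotted name `datatui.datatui`, and the example data, the `"reviews"` collection name, and the `render_example` helper are hypothetical.

```python
from typing import Dict, List


def render_example(example: Dict) -> str:
    """Build the string shown for one example (hypothetical helper)."""
    return f"{example['title']}: {example['text']}"


# Hypothetical examples; the documented input is a list of examples.
examples: List[Dict] = [
    {"title": "Review 1", "text": "Great battery life."},
    {"title": "Review 2", "text": "Stopped working after a week."},
]


def run_annotation() -> None:
    """Start the annotation interface; call this from a terminal."""
    # Imported here so the rest of the sketch runs without datatui installed.
    # The import path is an assumption based on the documented name.
    from datatui import datatui

    datatui(
        examples,
        collection_name="reviews",
        cache_name="annotations",
        pbar=True,
        description="Is this review positive?",
        content_render=render_example,
    )
```

Because the default `content_render` reads `x['text']`, a custom renderer is likely only needed when examples carry extra fields worth showing, as with `title` here.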
datatui.new_batch(input_data, cache_name, collection_name, limit=150)
Read examples from a JSONL file or an iterable of dictionaries and return only those not present in the cache.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`input_data` | `Union[str, Path, Iterable[Dict]]` | Path to a JSONL file (as string or Path object) or an iterable of dictionaries containing examples. | required |
`cache_name` | `str` | Path to the cache directory. | required |
`collection_name` | `str` | Name of the collection for these examples. | required |
`limit` | `int` | Maximum number of uncached examples to return. If None, return all uncached examples. | `150` |
Returns:
Type | Description |
---|---|
`List[Dict]` | A list of examples that are not present in the cache, up to the specified limit. |
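The filtering behaviour described above can be illustrated with a small stand-in. This is a sketch of the semantics only, not the library's implementation: the `uncached_batch` and `example_key` names are hypothetical, a plain set stands in for the cache, and keying each example by its JSON serialization is an assumption about how examples might be identified.

```python
import json
from itertools import islice
from typing import Dict, Iterable, List, Optional, Set


def example_key(example: Dict) -> str:
    """Hypothetical cache key: the example serialized with sorted keys."""
    return json.dumps(example, sort_keys=True)


def uncached_batch(
    examples: Iterable[Dict],
    seen_keys: Set[str],
    limit: Optional[int] = 150,
) -> List[Dict]:
    """Return examples whose key is not in seen_keys, up to limit."""
    fresh = (ex for ex in examples if example_key(ex) not in seen_keys)
    # islice with a stop of None consumes the whole iterator,
    # which mirrors the documented "limit=None returns all" behaviour.
    return list(islice(fresh, limit))


stream = [{"text": "a"}, {"text": "b"}, {"text": "c"}, {"text": "d"}]
seen = {example_key({"text": "b"})}  # pretend "b" was already annotated

first_two = uncached_batch(stream, seen, limit=2)
everything = uncached_batch(stream, seen, limit=None)
print(first_two)
print(len(everything))
```

Filtering lazily and slicing afterwards means the input is only read as far as needed to fill the batch, which matters when the examples come from a large JSONL file.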
Add content key with background highlighting for entities to a stream of dictionaries.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`examples` | `Iterable[Dict]` | An iterable of dictionaries, each containing 'text' and 'entity' keys. | required |
Yields:
Name | Type | Description |
---|---|---|
`Dict` | `Iterable[Dict]` | A dictionary with the original keys and an additional 'content' key containing highlighted text. |
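To make the yielded shape concrete, here is a rough sketch of a generator with the same contract. It is not the library's code: the `highlight_entities` name is hypothetical, `'entity'` is assumed to be a plain substring of `'text'`, and the Rich-style markup `[black on yellow]...[/]` is an assumed way of expressing background highlighting.

```python
from typing import Dict, Iterable, Iterator


def highlight_entities(examples: Iterable[Dict]) -> Iterator[Dict]:
    """Yield each example with an added 'content' key (illustrative only)."""
    for ex in examples:
        text, entity = ex["text"], ex["entity"]
        # Assumption: the entity is a substring of the text, and
        # Rich-style markup is used to set a background colour.
        marked = text.replace(entity, f"[black on yellow]{entity}[/]")
        # Keep the original keys and add 'content' alongside them.
        yield {**ex, "content": marked}


sample = [{"text": "Paris is nice", "entity": "Paris"}]
highlighted = list(highlight_entities(sample))
print(highlighted[0]["content"])
```

Since the result carries a `'content'` key rather than overwriting `'text'`, a renderer such as `lambda x: x['content']` would be needed to display the highlighted version.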