nyx_extras¶
Nyx client SDK optional extras for LangChain and auto-parsing.
Submodules¶
Classes¶
An opinionated client wrapping LangChain to evaluate user queries against contents of a Nyx network.
A class for processing and querying datasets from Data instances.
Utility methods for query (system) prompt modification.
Package Contents¶
- class nyx_extras.NyxLangChain(config=None, llm=None, log_level=logging.WARN, system_prompt=None)¶
Bases:
nyx_client.client.NyxClient
An opinionated client wrapping LangChain to evaluate user queries against contents of a Nyx network.
This class extends NyxClient to provide LangChain-based functionality for querying Nyx network contents.
Note
The LLM must support tool calling.
- Parameters:
- query(query, data=None, include_own=False, sqlite_file=None)¶
Query the LLM with a user prompt and context from Nyx.
This method takes a user prompt and invokes it against the LLM associated with this instance, using context from Nyx.
- Parameters:
query (str) – The user input.
data (collections.abc.Sequence[nyx_client.data.Data] | None) – Sequence of data to use for context. If not specified, uses all subscribed data.
include_own (bool) – Include your own data, created in Nyx, in the query.
sqlite_file (str | None) – A file location to write the SQLite file to.
update_subscribed – If set to true, re-polls Nyx for subscribed data.
- Returns:
The answer from the LLM.
- Return type:
Note
If the data list is not provided, this method updates subscriptions and retrieves all subscribed data.
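A minimal usage sketch of the constructor and query signatures listed above. It assumes that the defaults config=None and llm=None resolve to a working configuration and a tool-calling LLM in your environment, which this page does not confirm. The helper name ask_nyx is hypothetical.

```python
import logging


def ask_nyx(question):
    """Ask a question against all subscribed data plus your own data.

    Hypothetical helper: only the NyxLangChain constructor and query
    signatures come from this page.
    """
    # Imported lazily so the sketch can be read without nyx_extras installed.
    from nyx_extras import NyxLangChain

    # Assumption: the default config and llm are usable as-is.
    client = NyxLangChain(log_level=logging.INFO)

    # data is omitted, so subscriptions are updated and all subscribed
    # data is used as context; include_own adds data you created in Nyx.
    return client.query(question, include_own=True)
```

Passing an explicit sequence of Data instances as data would restrict the context to those items instead.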
- class nyx_extras.Parser¶
A class for processing and querying datasets from Data instances.
This class provides methods to convert data into SQL databases or vector representations, and to perform queries on the processed data.
- vectors¶
The TF-IDF vector representations of the processed content.
- vectorizer¶
The TfidfVectorizer instance used for creating vectors.
- chunks¶
The text chunks created from the processed content.
- static data_as_db(data, additional_information=None, sqlite_file=None, if_exists='replace')¶
Process the content of multiple Data instances into an in-memory SQLite database.
This method downloads the content of each Data (if it’s a CSV) and converts it to an in-memory SQLite database. The resulting database engine is then returned for use with language models.
- Parameters:
data (list[nyx_client.data.Data]) – A list of Data instances to process.
additional_information (VectorResult | None) – List of additional information to be stored in the DB as a fallback
sqlite_file (str | None) – Provide a file for the database to reside in
if_exists (Literal['fail', 'replace', 'append']) – What to do if a table already exists. One of “fail”, “append”, or “replace”. The signature above defaults to “replace”.
- Returns:
An SQLAlchemy engine.Engine instance for the in-memory SQLite database.
- Return type:
sqlalchemy.engine.Engine
Note
If the list of data is empty, an empty database engine is returned.
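The CSV-to-SQLite step described above can be illustrated with the standard library alone. This is a sketch of the general technique, not the Parser implementation: it returns a plain sqlite3 connection rather than an SQLAlchemy engine, and the helper name csv_to_sqlite and its tables argument are hypothetical.

```python
import csv
import io
import sqlite3


def csv_to_sqlite(tables, if_exists="replace", sqlite_file=None):
    """Load CSV text into SQLite tables (hypothetical helper).

    tables maps a table name to CSV text whose first row is the header.
    """
    if if_exists not in ("fail", "replace", "append"):
        raise ValueError(f"unsupported if_exists value: {if_exists!r}")

    # ":memory:" keeps the database in RAM; a path persists it to disk.
    conn = sqlite3.connect(sqlite_file or ":memory:")
    for name, text in tables.items():
        rows = list(csv.reader(io.StringIO(text)))
        if not rows:
            continue  # nothing to load for an empty CSV
        header, body = rows[0], rows[1:]

        exists = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (name,),
        ).fetchone() is not None
        if exists and if_exists == "fail":
            raise ValueError(f"table {name!r} already exists")
        if exists and if_exists == "replace":
            conn.execute(f'DROP TABLE "{name}"')
            exists = False
        if not exists:
            columns = ", ".join(f'"{col}"' for col in header)
            conn.execute(f'CREATE TABLE "{name}" ({columns})')

        # With "append", rows are added to the table that is already there.
        placeholders = ", ".join("?" for _ in header)
        conn.executemany(f'INSERT INTO "{name}" VALUES ({placeholders})', body)
    conn.commit()
    return conn
```

An empty tables mapping yields a connection to an empty database, which parallels the note above about an empty data list.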
- static normalise_values(values)¶
Normalise names in a list of values.
- Parameters:
values (collections.abc.Sequence[str]) – A sequence of values to normalise.
- Returns:
A list of normalised values.
- Return type:
- data_as_vectors(data, chunk_size=1000)¶
Process the content of multiple Data instances into vector representations.
This method downloads the content of each Data, combines it, chunks it, and creates a TF-IDF vectorizer for the chunks.
- Parameters:
data (collections.abc.Sequence[nyx_client.data.Data]) – A sequence of Data instances to process.
chunk_size (int) – The size of each chunk when splitting the content. Defaults to 1000.
- Returns:
The current Parser instance with updated vectors, vectorizer, and chunks.
Note
If no content is found in any of the data, the method returns without processing.
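The chunk-then-vectorise flow described above can be sketched in pure Python. This is an illustration of the technique, not the Parser code, which uses a TfidfVectorizer. It assumes chunk_size counts characters (the unit is not stated on this page) and uses a smoothed IDF formula chosen for the example; the function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=1000):
    """Split text into consecutive pieces of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def tfidf_vectors(chunks):
    """Return (idf, vectors) with one sparse {term: weight} dict per chunk."""
    tokenised = [re.findall(r"\w+", chunk.lower()) for chunk in chunks]

    # Document frequency: the number of chunks that contain each term.
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    n = len(chunks)

    # Smoothed IDF, so terms present in every chunk keep a non-zero weight.
    idf = {term: math.log((1 + n) / (1 + df)) + 1 for term, df in doc_freq.items()}

    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append({term: count * idf[term] for term, count in counts.items()})
    return idf, vectors
```

Terms that appear in fewer chunks receive a larger weight, which is what lets a later similarity search favour chunks sharing distinctive words with the query.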
- query(text, k=3)¶
Query the processed data with a given text.
This method transforms the input text into a vector using the fitted vectorizer, and then finds the most similar chunks to this query vector.
- Parameters:
- Returns:
An object containing the top k matching chunks, their similarities, and associated metadata. If the vectorizer is not initialized, it returns a VectorResult indicating failure.
- Return type:
VectorResult
Note
This method assumes that self.vectorizer has been properly initialized. If self.vectorizer is None, it returns a VectorResult indicating failure.
- find_matching_chunk(query_vector, k=3)¶
Find the most similar chunks to the query vector.
This method computes the cosine similarity between the query vector and all document vectors, then returns the top k most similar chunks along with their similarities and metadata.
- Parameters:
query_vector (Any) – The vector representation of the query.
k (int) – The number of top matching chunks to return. Defaults to 3.
- Returns:
An object containing the top k matching chunks, their similarities, and associated metadata. If no vectors are available, it returns a VectorResult with empty lists and a failure message.
- Return type:
VectorResult
Note
This method assumes that self.vectors, self.chunks, and self.metadata have been properly initialized. If self.vectors is None, it returns a VectorResult indicating failure.
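The top-k cosine similarity search described above can be sketched as follows. This is an illustration, not the Parser code: vectors are plain {term: weight} dicts, the result is a list of pairs rather than a VectorResult, and the function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vector, vectors, chunks, k=3):
    """Return up to k (chunk, similarity) pairs, most similar first."""
    if not vectors:
        return []  # parallels the "no vectors available" case with an empty result
    scored = [(chunk, cosine(query_vector, vec)) for chunk, vec in zip(chunks, vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Because cosine similarity normalises by vector length, a long chunk does not outrank a short one merely by containing more words.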