
Conversation

@kartik-raj7

Description

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Code refactoring
  • Performance improvement
  • Other (please specify): Draft PR for task "Add sentiment classification task" #1346

Screenshots/Videos (if applicable)

Pre-submission Checklist

  • I have tested my changes thoroughly before submitting this PR
  • This PR contains minimal changes necessary to address the issue/feature
  • My code follows the project's coding standards and style guidelines
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if applicable)
  • All new and existing tests pass
  • I have searched existing PRs to ensure this change hasn't been submitted already
  • I have linked any relevant issues in the description
  • My commits have clear and descriptive messages

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.

@pull-checklist

Please make sure all the checkboxes are checked:

  • I have tested these changes locally.
  • I have reviewed the code changes.
  • I have added end-to-end and unit tests (if applicable).
  • I have updated the documentation and README.md file (if necessary).
  • I have removed unnecessary code and debug statements.
  • PR title is clear and follows the convention.
  • I have tagged reviewers or team members for feedback.
@coderabbitai
Contributor

coderabbitai bot commented Sep 28, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Search now optionally runs sentiment analysis using the last user interaction (when save_interaction=true) before dataset validation and search delegation. A new CogneeSearchSentiment model and an async run_sentiment_analysis task (re-exported) are added, plus a demo script showing usage.

Changes

Cohort / File(s) Summary of edits
Search flow augmentation
cognee/api/v1/search/search.py
Added imports (CogneeUserInteraction, get_graph_engine, run_sentiment_analysis, get_logger), initialized logger/graph engine, fetched last user interaction, and conditionally invoked sentiment analysis before dataset normalization and delegating to the existing search function.
Sentiment data model
cognee/modules/retrieval/utils/models.py
Added public class CogneeSearchSentiment(DataPoint) with fields: prev_question, prev_answer, current_question, sentiment (str), score (int), user_id (str), and optional belongs_to_set.
Sentiment task export
cognee/tasks/sentiment_analysis/__init__.py
Re-exported run_sentiment_analysis to make the task publicly importable.
Sentiment task implementation
cognee/tasks/sentiment_analysis/sentiment_analysis.py
Implemented async run_sentiment_analysis(prev_question, prev_answer, current_question, user): composes conversation text, calls LLM gateway for structured sentiment output, maps response into CogneeSearchSentiment, persists via add_data_points, and returns a compact dict with sentiment and score.
Example script
examples/python/sentiment_analysis_demo.py
New demo script showing pruning, ingestion/cognify, and multiple searches with save_interaction=True under an asyncio entrypoint.
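The run_sentiment_analysis flow described above can be sketched roughly as follows. This is a hedged illustration, not the actual cognee implementation: fake_llm_structured_output and fake_add_data_points are stand-ins for LLMGateway.acreate_structured_output and add_data_points, and the dict-based data point stands in for CogneeSearchSentiment.

```python
# Illustrative sketch of the walkthrough's run_sentiment_analysis flow.
# All helper names here are stand-ins, not the real cognee API.
import asyncio


async def fake_llm_structured_output(text_input: str) -> dict:
    # Stand-in for LLMGateway.acreate_structured_output; a real call would
    # send text_input to the LLM with a sentiment-classification prompt.
    return {"sentiment": "Positive", "score": 3}


async def fake_add_data_points(data_points: list) -> None:
    # Stand-in for the add_data_points persistence call.
    pass


async def run_sentiment_analysis(prev_question, prev_answer, current_question, user_id):
    # 1. Compose the conversation context
    text_input = (
        f"Previous question: {prev_question}\n"
        f"Previous answer: {prev_answer}\n"
        f"Current question: {current_question}"
    )
    # 2. Ask the LLM for structured sentiment output
    result = await fake_llm_structured_output(text_input)
    # 3. Map the response into a data point and persist it
    data_point = {
        "prev_question": prev_question,
        "prev_answer": prev_answer,
        "current_question": current_question,
        "sentiment": result["sentiment"],
        "score": result["score"],
        "user_id": user_id,
    }
    await fake_add_data_points([data_point])
    # 4. Return a compact dict with just sentiment and score
    return {"sentiment": result["sentiment"], "score": result["score"]}
```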

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant API as Search API
  participant Graph as Graph Engine
  participant SA as Sentiment Task
  participant LLM as LLM Gateway
  participant Store as Data Store

  User->>API: search(query_text, save_interaction=True)
  API->>Graph: init() + get_last_user_interaction(limit=1)
  alt last interaction exists AND save_interaction
    API->>SA: run_sentiment_analysis(prev_q, prev_a, current_q, user)
    SA->>LLM: acreate_structured_output(conversation context)
    LLM-->>SA: { "sentiment": "...", "score": ... }
    SA->>Store: add_data_points(CogneeSearchSentiment)
    SA-->>API: { prev_q, prev_a, current_q, sentiment, score }
  else No prior interaction or save_off
    API-->>API: skip sentiment analysis
  end
  API->>API: validate/normalize datasets
  API->>API: delegate to existing search_function(...)
  API-->>User: search results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

run-checks

Suggested reviewers

  • borisarzentar

Poem

A twitch, a hop, I read the past,
I scent the tone and score it fast.
I log the mood with whiskered care,
Then bound to fetch the answers there.
Happy hops — your searches fair. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
  • Description Check — ⚠️ Warning: The pull request description follows the template structure but mislabels the nature of the change by marking “Other (Draft PR for task #1346)” instead of selecting “New feature,” and several pre-submission checklist items (tests, documentation, coding standards) remain unchecked, reducing clarity on readiness. Resolution: update the Type of Change to “New feature,” check the applicable pre-submission checklist items (including tests and documentation), and consider expanding the Description section with key implementation details to provide full context.
  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 50.00%, which is below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (1 passed)

  • Title Check — ✅ Passed: The PR title "feat: add sentiment classification task #1346" is directly and specifically related to the core changes in the pull request. The changeset introduces a new sentiment analysis task (the run_sentiment_analysis function), a new data model class CogneeSearchSentiment, integration of sentiment analysis into the search workflow, and supporting infrastructure. The title accurately and concisely captures the primary purpose of the changes without being vague or overly broad, and is clear enough for developers scanning the history to understand the main change at a glance.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@kartik-raj7 kartik-raj7 marked this pull request as ready for review September 28, 2025 21:22
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cognee/api/v1/search/search.py (1)

175-213: Pass a resolved user instance into sentiment analysis

Right now we call run_sentiment_analysis before user is resolved, and we pass the User class object instead of an instance. With the default path (user arg is None), this raises on str(user.id) in the task every time sentiment analysis runs. Please move the if user is None: user = await get_default_user() block above this section and pass the actual user variable. Also drop the unused results assignment.

Suggested diff:

-    graph_engine = await get_graph_engine()
+    graph_engine = await get_graph_engine()
@@
-    # if last interaction is present and save interaction is enabled call sentiment analysis
-    if len(last_interactions)>0 and save_interaction:
-        results = await run_sentiment_analysis(
-        prev_question=last_interactions[0]['question'],
-        prev_answer=last_interactions[0]['answer'],
-        current_question=query_text,
-        user=User)
-        # print(results)
-        # print('here i am and this is interaction',last_interactions[0]['question'],last_interactions[0]['answer'],query_text)
+    if user is None:
+        user = await get_default_user()
+
+    if last_interactions and save_interaction:
+        await run_sentiment_analysis(
+            prev_question=last_interactions[0]["question"],
+            prev_answer=last_interactions[0]["answer"],
+            current_question=query_text,
+            user=user,
+        )
@@
-    if user is None:
-        user = await get_default_user()
-
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 107b5af and 6fc6750.

📒 Files selected for processing (5)
  • cognee/api/v1/search/search.py (2 hunks)
  • cognee/demo.py (1 hunks)
  • cognee/modules/retrieval/utils/models.py (1 hunks)
  • cognee/tasks/sentiment_analysis/__init__.py (1 hunks)
  • cognee/tasks/sentiment_analysis/sentiment_analysis.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
cognee/modules/retrieval/utils/models.py (2)
cognee/infrastructure/engine/models/DataPoint.py (1)
  • DataPoint (20-220)
cognee/modules/engine/models/node_set.py (1)
  • NodeSet (4-7)
cognee/tasks/sentiment_analysis/__init__.py (1)
cognee/tasks/sentiment_analysis/sentiment_analysis.py (1)
  • run_sentiment_analysis (11-43)
cognee/tasks/sentiment_analysis/sentiment_analysis.py (3)
cognee/infrastructure/llm/LLMGateway.py (1)
  • LLMGateway (6-164)
cognee/modules/engine/models/node_set.py (1)
  • NodeSet (4-7)
cognee/modules/retrieval/utils/models.py (1)
  • CogneeSearchSentiment (44-52)
cognee/api/v1/search/search.py (5)
cognee/modules/retrieval/utils/models.py (1)
  • CogneeUserInteraction (8-14)
cognee/infrastructure/databases/graph/get_graph_engine.py (1)
  • get_graph_engine (10-24)
cognee/tasks/sentiment_analysis/sentiment_analysis.py (1)
  • run_sentiment_analysis (11-43)
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (2)
  • get_last_user_interaction_ids (1328-1349)
  • get_node (685-705)
cognee/infrastructure/databases/graph/graph_db_interface.py (1)
  • get_node (229-238)
cognee/demo.py (2)
cognee/tasks/sentiment_analysis/sentiment_analysis.py (1)
  • run_sentiment_analysis (11-43)
cognee/api/v1/search/search.py (1)
  • search (15-223)
🪛 Ruff (0.13.1)
cognee/api/v1/search/search.py

186-186: Local variable results is assigned to but never used

Remove assignment to unused variable results

(F841)

@lxobr lxobr self-requested a review September 29, 2025 09:14
@lxobr
Collaborator

lxobr commented Sep 29, 2025

Hey @kartik-raj7, thanks so much for your contribution! I really appreciate the time and effort you’ve put in, and I’m looking forward to reviewing this one.

A couple of quick notes:
1. I see you have two draft PRs open. Could you please close the one that’s not relevant, and update the PR title to follow the format “feat: issue title”? You can reference the issue in the description.
2. There are a few commented-out print statements in the code. Could you remove those, please?
3. I really like that you included a demo.py script! At the moment, it doesn’t seem to fully demonstrate the implemented functionality. Could you double-check that, rename it to something more descriptive, and move it into examples/python?

Once you’ve had a chance to address these points, I’ll be happy to take another look. Thanks again for your work on this! Really glad to have your contribution here and looking forward to supporting and collaborating.

@kartik-raj7
Author

kartik-raj7 commented Sep 29, 2025

> Hey @kartik-raj7, thanks so much for your contribution! I really appreciate the time and effort you’ve put in, and I’m looking forward to reviewing this one.
>
> A couple of quick notes: 1. I see you have two draft PRs open. Could you please close the one that’s not relevant, and update the PR title to follow the format “feat: issue title”? You can reference the issue in the description. 2. There are a few commented-out print statements in the code. Could you remove those, please? 3. I really like that you included a demo.py script! At the moment, it doesn’t seem to fully demonstrate the implemented functionality. Could you double-check that, rename it to something more descriptive, and move it into examples/python?
>
> Once you’ve had a chance to address these points, I’ll be happy to take another look. Thanks again for your work on this! Really glad to have your contribution here and looking forward to supporting and collaborating

Hey @lxobr, I’ve closed the irrelevant PR. Actually, this was a draft PR, and I mistakenly clicked the review option. I’ve integrated the sentiment analysis with the search. You can run it using sentiment_analysis_demo.py in examples/python.

I also have a couple of questions:

  1. What happens to the data points created from sentiment analysis—do they need to be connected to each other?

  2. Is my approach to sentiment analysis correct, considering the previous question and the current question, for determining the user’s sentiment for Cognee search results?

  3. Am I headed in the right direction with this implementation?

  4. Could you guide me on what needs to be done for the Memify pipeline? I couldn’t figure it out.

  5. For the CogneeSearchSentiment data point, should I define an Enum for the sentiment field like it was done for UserFeedbackSentiment, or keep it as a plain string?

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cognee/api/v1/search/search.py (1)

198-201: Logic bug: stray brackets force the datasets branch to always execute

if datasets is not None and [all(...)] is always truthy because [False] is truthy. This will incorrectly treat UUID lists as string names.

Apply:

-    if datasets is not None and [all(isinstance(dataset, str) for dataset in datasets)]:
+    if datasets is not None and all(isinstance(dataset, str) for dataset in datasets):
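The bug is easy to reproduce in isolation: wrapping all(...) in square brackets yields a one-element list, and any non-empty list is truthy, even [False].

```python
from uuid import uuid4

# datasets holds UUIDs, not strings, so the string check should fail
datasets = [uuid4()]

# Buggy form from the PR: the brackets make the condition a list like [False]
buggy = datasets is not None and [all(isinstance(d, str) for d in datasets)]

# Fixed form: the condition is the boolean itself
fixed = datasets is not None and all(isinstance(d, str) for d in datasets)
```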
🧹 Nitpick comments (2)
cognee/tasks/sentiment_analysis/sentiment_analysis.py (2)

5-6: Align NodeSet import path with existing modules

Other modules import NodeSet from ...models.node_set. Align to avoid package export assumptions.

-from cognee.modules.engine.models import NodeSet
+from cognee.modules.engine.models.node_set import NodeSet

25-33: Make sentiment data point IDs collision‑resistant

Using only user_id + current_question will overwrite prior entries when the same user asks the same question again with different prior context. Include prior Q/A (or a timestamp/UUID4).

-        id=uuid5(NAMESPACE_OID, name=user_id + current_question),
+        id=uuid5(NAMESPACE_OID, name=f"{user_id}|{prev_question}|{prev_answer}|{current_question}"),

Optionally clamp/cast score before persist:

-        score=sentiment_result.score,
+        score=int(max(-5, min(5, sentiment_result.score))),
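The overwrite risk follows directly from uuid5 being deterministic: the same name always yields the same ID, so widening the name with prior context (as suggested above) produces distinct IDs. A standard-library demonstration:

```python
from uuid import uuid5, NAMESPACE_OID

# Same user + same current question -> identical IDs, so the second
# data point would overwrite the first regardless of prior context.
id_a = uuid5(NAMESPACE_OID, "user-1" + "What is cognee?")
id_b = uuid5(NAMESPACE_OID, "user-1" + "What is cognee?")

# Including the prior Q/A in the name yields a different, collision-resistant ID.
id_c = uuid5(NAMESPACE_OID, "user-1|prev-q|prev-a|What is cognee?")
```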
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fc6750 and 3e9cbda.

📒 Files selected for processing (3)
  • cognee/api/v1/search/search.py (2 hunks)
  • cognee/tasks/sentiment_analysis/sentiment_analysis.py (1 hunks)
  • examples/python/sentiment_analysis_demo.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
cognee/tasks/sentiment_analysis/sentiment_analysis.py (3)
cognee/infrastructure/llm/LLMGateway.py (1)
  • LLMGateway (6-164)
cognee/modules/engine/models/node_set.py (1)
  • NodeSet (4-7)
cognee/modules/retrieval/utils/models.py (1)
  • CogneeSearchSentiment (44-52)
examples/python/sentiment_analysis_demo.py (2)
cognee/api/v1/search/search.py (1)
  • search (15-221)
cognee-mcp/src/server.py (1)
  • save_interaction (286-338)
cognee/api/v1/search/search.py (6)
cognee/modules/retrieval/utils/models.py (1)
  • CogneeUserInteraction (8-14)
cognee/infrastructure/databases/graph/get_graph_engine.py (1)
  • get_graph_engine (10-24)
cognee/tasks/sentiment_analysis/sentiment_analysis.py (1)
  • run_sentiment_analysis (11-41)
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (2)
  • get_last_user_interaction_ids (1328-1349)
  • get_node (685-705)
cognee/infrastructure/databases/graph/kuzu/adapter.py (2)
  • get_last_user_interaction_ids (1650-1671)
  • get_node (787-822)
cognee/infrastructure/databases/graph/graph_db_interface.py (1)
  • get_node (229-238)
🪛 Ruff (0.13.1)
cognee/api/v1/search/search.py

186-186: Local variable results is assigned to but never used

Remove assignment to unused variable results

(F841)

🔇 Additional comments (3)
cognee/tasks/sentiment_analysis/sentiment_analysis.py (1)

19-23: Fix structured-output schema: don’t pass DataPoint model to the LLM response

Use a minimal Pydantic model for the LLM output, then map into CogneeSearchSentiment. This prevents validation failure on required fields.

Apply:

-from cognee.modules.retrieval.utils.models import CogneeSearchSentiment
+from cognee.modules.retrieval.utils.models import CogneeSearchSentiment
+from pydantic import BaseModel, Field
@@
-    sentiment_result = await LLMGateway.acreate_structured_output(
+    class SentimentClassification(BaseModel):
+        sentiment: str
+        score: int = Field(ge=-5, le=5)
+
+    sentiment_result = await LLMGateway.acreate_structured_output(
         text_input=text_input,
         system_prompt="""Classify the user's reaction as Positive, Neutral, or Negative with a score (-5 to 5).Return the result as valid JSON like:{"sentiment": "Positive","score": 3}""",
-        response_model= CogneeSearchSentiment  
+        response_model=SentimentClassification
     )
cognee/api/v1/search/search.py (2)

12-12: Import path is correct, no change needed

get_graph_engine is re-exported in cognee/infrastructure/databases/graph/__init__.py, so the existing import works as intended.


172-183: Guard graph client initialization & normalize adapter response shapes

File: cognee/api/v1/search/search.py:172-183

  • Only call get_graph_engine() when save_interaction is True to avoid unnecessary overhead.
  • Normalize the output of graph_engine.get_node() across adapters (e.g. unwrap Kuzu’s {properties: …}) so question/answer fields can be accessed uniformly.
@kartik-raj7 kartik-raj7 changed the title Draft Pr For Task #1346 Sep 29, 2025
@lxobr
Collaborator

lxobr commented Sep 29, 2025

Hey @kartik-raj7 , thanks for the updates! I will go over your code and questions tomorrow and will be in touch. In the meanwhile, I would like to kindly ask you to address and resolve the comments raised by the coderabbit. Thanks so much for your work on this and I will get back to you shortly.

@hande-k hande-k added the community-contribution Community contribution label label Sep 29, 2025
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cognee/api/v1/search/search.py (1)

208-214: Bug: square brackets force the condition to always be truthy

[all(...)] creates a non-empty list, so the branch runs for any non-None datasets. Fix by removing the brackets.

-    if datasets is not None and [all(isinstance(dataset, str) for dataset in datasets)]:
+    if datasets is not None and all(isinstance(dataset, str) for dataset in datasets):
         datasets = await get_authorized_existing_datasets(datasets, "read", user)
         datasets = [dataset.id for dataset in datasets]
         if not datasets:
             raise DatasetNotFoundError(message="No datasets found.")
🧹 Nitpick comments (6)
examples/python/sentiment_analysis_demo.py (3)

1-3: Add a brief module docstring noting data reset and requirements

Clarify that the demo wipes local data and needs LLM_API_KEY to run.

+"""Sentiment Analysis demo.
+
+- WARNING: prunes local data and system metadata.
+- Requires LLM_API_KEY for LLM-backed search.
+"""
 import cognee
 import asyncio

4-12: Fail fast if LLM_API_KEY is missing

Provide a clear message rather than failing later during search.

 async def main():
+    import os
+    if not os.getenv("LLM_API_KEY"):
+        print("Set LLM_API_KEY to run LLM-backed search.")
+        return
     await cognee.prune.prune_data()
     await cognee.prune.prune_system(metadata=True)

27-31: Improve output readability for demo users

Pretty-print the results to make the demo more informative.

-    for query, res in all_results.items():
-        print(f"\nQuery: {query}")
-        for r in res:
-            print(f"  - {r}")
+    import pprint
+    pp = pprint.PrettyPrinter(indent=2, width=100)
+    for query, res in all_results.items():
+        print(f"\nQuery: {query}")
+        pp.pprint(res)
+    print("\nNote: Sentiment runs starting with the second query (uses prior interaction).")
cognee/api/v1/search/search.py (3)

175-176: Minor: merge isinstance calls

Use the tuple form to simplify.

-    if isinstance(datasets, UUID) or isinstance(datasets, str):
+    if isinstance(datasets, (UUID, str)):
         datasets = [datasets]

205-206: Log the stack trace and avoid generic “error” log

Use logger.exception and drop the unused exception variable.

-        except Exception as e:
-            logger.error(f"Sentiment Analysis Failed: {e}")
+        except Exception:
+            logger.exception("Sentiment analysis failed")

This also addresses lints BLE001/TRY400. [Based on learnings]
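The practical difference is that logger.exception records the active traceback, which logger.error(f"... {e}") discards. A small self-contained illustration (logger name and message are illustrative):

```python
import io
import logging

# Capture log output in a string buffer so we can inspect it
buf = io.StringIO()
logger = logging.getLogger("sentiment_demo")
logger.addHandler(logging.StreamHandler(buf))
logger.propagate = False

try:
    raise ValueError("LLM call failed")  # stand-in for a failed sentiment call
except Exception:
    # logger.exception logs at ERROR level and appends the full traceback
    logger.exception("Sentiment analysis failed")

output = buf.getvalue()
```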


181-204: Direction check: sentiment integration looks good; a few product decisions

  • Approach: Using (prev_question, prev_answer, current_question) to infer reaction to the previous answer is sound.
  • Storage/linking: Also link each CogneeSearchSentiment to the corresponding CogneeUserInteraction (DERIVED_FROM) and to the user or user set. Adds traceability over time.
  • Enum vs string: Prefer an Enum for sentiment (align with UserFeedbackSentiment) for consistency and validation; keep a numeric score for sorting.
  • Resilience: Current try/except + non-blocking is appropriate; consider sampling/ratelimiting if save_interaction is high-traffic.
  • Memify pipeline: Run sentiment as part of the save_interaction flow (post-ingest) or via a small background task triggered there; keep search path fast and optional.
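If the Enum route is taken for the sentiment field, a minimal sketch could look like the following. The class name and values are assumptions for illustration; the real UserFeedbackSentiment enum in the codebase may use different members.

```python
from enum import Enum


class SearchSentiment(str, Enum):
    """Hypothetical sentiment enum, mirroring the UserFeedbackSentiment pattern.

    Subclassing str keeps the field JSON-serializable and comparable to
    plain strings, while the Enum rejects values outside the allowed set.
    """

    POSITIVE = "Positive"
    NEUTRAL = "Neutral"
    NEGATIVE = "Negative"
```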
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e9cbda and 0a53eec.

📒 Files selected for processing (2)
  • cognee/api/v1/search/search.py (2 hunks)
  • examples/python/sentiment_analysis_demo.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
examples/python/sentiment_analysis_demo.py (2)
cognee/api/v1/search/search.py (1)
  • search (16-231)
cognee-mcp/src/server.py (1)
  • save_interaction (286-338)
cognee/api/v1/search/search.py (9)
cognee/modules/retrieval/utils/models.py (1)
  • CogneeUserInteraction (8-14)
cognee/infrastructure/databases/graph/get_graph_engine.py (1)
  • get_graph_engine (10-24)
cognee/tasks/sentiment_analysis/sentiment_analysis.py (1)
  • run_sentiment_analysis (11-41)
cognee/shared/logging_utils.py (1)
  • get_logger (182-194)
cognee/infrastructure/databases/graph/kuzu/adapter.py (2)
  • get_last_user_interaction_ids (1650-1671)
  • get_node (787-822)
cognee/infrastructure/databases/graph/neo4j_driver/adapter.py (2)
  • get_last_user_interaction_ids (1328-1349)
  • get_node (685-705)
cognee/modules/graph/cognee_graph/CogneeGraph.py (1)
  • get_node (46-47)
cognee/infrastructure/databases/graph/graph_db_interface.py (1)
  • get_node (229-238)
cognee/infrastructure/databases/graph/memgraph/memgraph_adapter.py (1)
  • get_node (649-656)
🪛 Ruff (0.13.1)
cognee/api/v1/search/search.py

205-205: Do not catch blind exception: Exception

(BLE001)


206-206: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🪛 Pylint (3.3.8)
cognee/api/v1/search/search.py

[refactor] 175-175: Consider merging these isinstance calls to isinstance(datasets, (UUID, str))

(R1701)

@kartik-raj7
Author

> Hey @kartik-raj7, thanks for the updates! I will go over your code and questions tomorrow and will be in touch. In the meanwhile, I would like to kindly ask you to address and resolve the comments raised by the coderabbit. Thanks so much for your work on this and I will get back to you shortly.

Hey @lxobr, were you able to have a look at it? Please let me know if you have any suggestions or anything I need to improve.

@lxobr
Copy link
Collaborator

lxobr commented Oct 13, 2025

Hey @kartik-raj7! Thanks so much for your work on this so far. And sorry for the late reply, I was off for a couple of days. Let me clarify the scope of this issue/PR since it’s a bit ambiguously worded.

There are two main parts to implement.

First, we need to create a task that processes all saved interactions. This is the task you’ve already started working on. Here are some additional details about what the task should do:

  • Fetch all interactions (check out the get_filtered_graph_data function and how it’s used).
    For now, it’s fine to process all interactions. We can add filtering in a follow-up issue.
  • For each interaction, run sentiment analysis on the question.
    • Each question should be evaluated in isolation, we’re not yet trying to establish any chain of previous or next questions.
  • If sentiment is non-neutral, turn it into a custom data point.
  • Skip neutral ones since they don’t add value.
  • The task should return a list of data points (or an empty list if none).
  • Log when no data points are created.

I would suggest you check the codebase to see what happens when save_interaction is enabled in search. Feel free to reach out if anything is unclear.
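The task outline above can be sketched roughly as follows. Everything here is illustrative, not the cognee API: classify_sentiment is a keyword-based stand-in for the real LLM call, and the dict interactions/data points stand in for the graph data and the custom DataPoint class.

```python
# Illustrative sketch of the extraction task described above.
# classify_sentiment is a stand-in for the LLM call; in the real task the
# interactions would come from the graph (e.g. via get_filtered_graph_data).
import asyncio
import logging

logger = logging.getLogger(__name__)


async def classify_sentiment(question: str) -> tuple[str, int]:
    # Keyword stand-in for the LLM classifier; returns (sentiment, score).
    lowered = question.lower()
    if "love" in lowered or "great" in lowered:
        return "Positive", 3
    if "hate" in lowered or "wrong" in lowered:
        return "Negative", -3
    return "Neutral", 0


async def extract_sentiment_points(interactions: list[dict]) -> list[dict]:
    """Evaluate each question in isolation; keep only non-neutral results."""
    data_points = []
    for interaction in interactions:
        sentiment, score = await classify_sentiment(interaction["question"])
        if sentiment == "Neutral":
            continue  # neutral interactions add no value; skip them
        data_points.append(
            {
                "question": interaction["question"],
                "sentiment": sentiment,
                "score": score,
            }
        )
    if not data_points:
        logger.info("No sentiment data points were created")
    return data_points
```

In the memify pipeline this function would play the extraction_task role, with its returned list fed to add_data_points as the enrichment_task.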

Second, we need to create a custom memify pipeline that uses the task. This essentially means creating your own version of this example. For this issue, you should be able to keep it minimal by setting:

  • extraction_task = your new task
  • enrichment_task = add_data_points

This can all live in your demo, which you’ve already started working on. Just make sure to first run regular cognify on some text, perform a few searches (some neutral, some not), and then run your memify pipeline demonstrating that the sentiment analysis works.

Thanks again for all the time and effort you are investing in this. I am looking forward to hearing from you and supporting you to get this PR through!

zanieb and others added 2 commits October 17, 2025 16:27
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->

Closes topoteretes#1558

## Type of Change
<!-- Please check the relevant option -->
- [x] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [x] **I have tested my changes thoroughly before submitting this PR**
- [x] **This PR contains minimal changes necessary to address the
issue/feature**
- [x] My code follows the project's coding standards and style
guidelines
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added necessary documentation (if applicable)
- [x] All new and existing tests pass
- [x] I have searched existing PRs to ensure this change hasn't been
submitted already
- [x] I have linked any relevant issues in the description
- [x] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
user_id=user_id,
belongs_to_set=NodeSet(id=uuid5(NAMESPACE_OID, "CogneeSearchSentiment"), name="CogneeSearchSentiment")
)
await add_data_points(data_points=[sentiment_data_point], update_edge_collection=True)
Member


update_edge_collection is removed from add_data_points, only data_points should be sent.

last = props
break
if last:
await run_sentiment_analysis(
Member


Running sentiment analysis on search request will slow down the search. We are currently trying to speed it up, so would be good to have this running somehow independently.
When/where do you need that sentiment analysis? What is the use case?

@Vasilije1990 Vasilije1990 changed the base branch from main to dev October 19, 2025 14:12
hande-k and others added 4 commits October 20, 2025 12:08
<!-- .github/pull_request_template.md -->

## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [x] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
<!-- .github/pull_request_template.md -->

## Description
Add scorecard CI/CD for github secure open source fund

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Pre-submission Checklist
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
@kartik-raj7
Author

Hey @lxobr ,

Here’s an update on what I’ve done so far:

  • Fetching Interactions: I used the get_graph_data function to fetch all interactions, as I faced issues with attribute filters in get_filtered_graph_data.

  • Sentiment Analysis Task: I’ve implemented the task to process each question and perform sentiment analysis. It returns sentiment data points for non-neutral questions, and neutral ones are skipped.

  • Enrichment Task: I’ve written the enrichment task to store the sentiment data points created by the sentiment analysis task.

  • Memify Pipeline: The memify pipeline is set up with the sentiment analysis task as the extraction task and the enrichment task to handle the storage of data points.

Where I need help:

I’m unsure about how to pass the list of sentiment data points from the sentiment analysis task to add_data_points in the enrichment task within the memify pipeline.

Could you guide me on how to correctly pass this data between tasks in memify? Also, please let me know if my work so far aligns with your expectations and how I should proceed.
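To make the hand-off question concrete, here is a minimal sketch of the pattern I have in mind. This is not cognee's actual memify API; `SentimentPoint`, `sentiment_task`, and `enrichment_task` are hypothetical names, and the `classify` helper stands in for the real LLM call. The idea is that the extraction task returns the list of data points and the enrichment task receives that list directly as its input:

```python
# Sketch only: all names here are illustrative, not cognee's actual API.
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SentimentPoint:
    question: str
    sentiment: str  # "positive" or "negative"; neutral questions are skipped


def sentiment_task(questions: Iterable[str]) -> List[SentimentPoint]:
    """Extraction task: classify each question, skipping neutral ones."""

    def classify(q: str) -> str:
        # Placeholder for the real LLM call.
        return "positive" if "thanks" in q.lower() else "neutral"

    return [
        SentimentPoint(q, s)
        for q in questions
        if (s := classify(q)) != "neutral"
    ]


def enrichment_task(points: List[SentimentPoint]) -> int:
    """Enrichment task: store the points (stand-in for add_data_points)."""
    stored = list(points)  # here the real task would call add_data_points
    return len(stored)


# If the pipeline feeds each task's output into the next task's input,
# the wiring reduces to ordinary composition:
count = enrichment_task(sentiment_task(["thanks a lot!", "what time is it?"]))
print(count)  # 1: only the non-neutral question produced a data point
```

If memify chains tasks this way (output of one as input of the next), the enrichment task would just declare the list of data points as its first parameter; if it instead expects generators or a shared context object, the shape above would need adjusting.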

@gitguardian

gitguardian bot commented Oct 22, 2025

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secret in your pull request
| GitGuardian id | Status | Secret | Commit | Filename |
| --- | --- | --- | --- | --- |
| 9573981 | Triggered | Generic Password | f240035 | .github/workflows/e2e_tests.yml |
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn the best practices here.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act: you might completely break other contributing developers' workflows, and you risk accidentally deleting legitimate data.

🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@lxobr
Collaborator

lxobr commented Oct 28, 2025

Hey @kartik-raj7 , thanks so much for all the work you've done so far! Great to see this shaping up 🚀

Currently, the demo isn’t working for me. Could you please confirm whether it’s running on your end? I won’t be able to proceed until this is resolved.

I also took a look at the code you submitted and had a few general remarks that I hope you’ll find helpful:

  • We avoid using print; please use the existing logging framework instead.
  • Typically, you should pass a lighter model to the LLM and then create a datapoint from that, so the LLM doesn’t receive unnecessary fields. You can find similar patterns throughout the codebase.
  • The pipeline shouldn’t be in the tasks folder on its own. For now, it’s fine for the pipeline to be directly integrated into the demo example. Could you please move it?
  • It might be helpful to review the recently merged feedback enrichment memify example. I think you’ll find several relevant patterns there.
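On the second point above, the "lighter model" pattern might look like the following sketch: the LLM only ever sees a minimal schema, and the full data point is built from the result afterwards. All names here (`SentimentResult`, `SentimentDataPoint`, `to_data_point`) are illustrative, not the codebase's actual classes:

```python
# Sketch of the "pass a lighter model to the LLM" pattern.
# Names are illustrative, not cognee's actual classes.
import uuid
from dataclasses import dataclass, field


@dataclass
class SentimentResult:
    """Minimal shape the LLM is asked to produce: no ids, no graph metadata."""
    label: str
    score: float


@dataclass
class SentimentDataPoint:
    """Full data point stored in the graph, built from the LLM result."""
    label: str
    score: float
    question_id: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


def to_data_point(result: SentimentResult, question_id: str) -> SentimentDataPoint:
    # The LLM never sees question_id or id; they are attached here.
    return SentimentDataPoint(result.label, result.score, question_id)


dp = to_data_point(SentimentResult("negative", 0.93), "q-42")
print(dp.label, dp.question_id)  # negative q-42
```

This keeps the LLM prompt small and the structured-output schema free of fields the model cannot know, while the storage layer still gets a complete record.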

Thanks again for all the time and effort you’re putting into this. I’m looking forward to seeing the improvements and reviewing the next iteration. 💪

@borisarzentar borisarzentar changed the title Feat: Add sentiment classification task #1346 Oct 29, 2025

Labels

community-contribution: Community contribution

8 participants