
Commit 60cc76e

Adding github assistant code (#340)
* Adding github assistant code
* fmt
* formating evaluation.py
* lint
* formating second attempt
* remoivng extra spaces
* formating index.py
* formating file_summary.append
* formating parse_document function
* thrid attempt of fixing index.py
* lint 4x
* removing extra lines
* Adding evaluation-result.txt
* adding README.md
1 parent 6ddb6a2 commit 60cc76e

File tree

6 files changed: +580 -0 lines changed

README.md
@@ -0,0 +1,28 @@
# GitHub Assistant

Easily ask questions about your GitHub repository using RAG and Elasticsearch as a vector database.

### How to use this code

1. Install Required Libraries:

```bash
pip install -r requirements.txt
```
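
The `requirements.txt` file itself is not shown in this diff. A plausible minimal version, inferred from the imports in `evaluation.py` below (exact package names and pins may differ from the committed file):

```
python-dotenv
pandas
tabulate
httpx
elasticsearch
llama-index
llama-index-llms-openai
```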

2. Set Up Environment Variables:

`GITHUB_TOKEN`, `GITHUB_OWNER`, `GITHUB_REPO`, `GITHUB_BRANCH`, `ELASTIC_CLOUD_ID`, `ELASTIC_USER`, `ELASTIC_PASSWORD`, `ELASTIC_INDEX`, `OPENAI_API_KEY`
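
A minimal sketch of how these variables are picked up (the `load_dotenv(".env")` call mirrors `evaluation.py`; the startup check is an illustrative addition):

```python
import os

from dotenv import load_dotenv

# Read variables from a local .env file into the process environment.
load_dotenv(".env")

# Illustrative sanity check: fail fast if anything required is missing.
required = [
    "GITHUB_TOKEN", "GITHUB_OWNER", "GITHUB_REPO", "GITHUB_BRANCH",
    "ELASTIC_CLOUD_ID", "ELASTIC_USER", "ELASTIC_PASSWORD",
    "ELASTIC_INDEX", "OPENAI_API_KEY",
]
missing = [name for name in required if not os.getenv(name)]
if missing:
    raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```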

3. Index your data and create the embeddings by running:

```bash
python index.py
```

An Elasticsearch index will be generated, housing the embeddings. You can then connect to your ESS deployment and run a search query against the index; you will see a new field named `embeddings`.
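
One quick way to verify the field exists (a sketch using the official `elasticsearch` Python client; this script is not part of the commit):

```python
import os

from elasticsearch import Elasticsearch

# Connect with the same credentials used by index.py.
es = Elasticsearch(
    cloud_id=os.environ["ELASTIC_CLOUD_ID"],
    basic_auth=(os.environ["ELASTIC_USER"], os.environ["ELASTIC_PASSWORD"]),
)

# Fetch a single document and confirm the embeddings field is present.
hit = es.search(index=os.environ["ELASTIC_INDEX"], size=1)["hits"]["hits"][0]
print("embeddings" in hit["_source"])
```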

4. Ask questions about your codebase by running:

```bash
python query.py
```
evaluation-result.txt
@@ -0,0 +1,90 @@
```
Number of documents loaded: 5

All available questions generated:
0. What is the purpose of chunking monitors in the updated push command as mentioned in the changelog?
1. How does the changelog describe the improvement made to the performance of the push command?
2. What new feature is added to the synthetics project when it is created via the `init` command?
3. According to the changelog, what is the file size of the CHANGELOG.md document?
4. On what date was the CHANGELOG.md file last modified?
5. What is the significance of the example lightweight monitor yaml file mentioned in the changelog?
6. How might the changes described in the changelog impact the workflow of users creating or updating monitors?
7. What is the file path where the CHANGELOG.md document is located?
8. Can you identify the issue numbers associated with the changes mentioned in the changelog?
9. What is the creation date of the CHANGELOG.md file as per the context information?
10. What type of file is the document described in the context information?
11. On what date was the CHANGELOG.md file last modified?
12. What is the file size of the CHANGELOG.md document?
13. Identify one of the bug fixes mentioned in the CHANGELOG.md file.
14. What command is referenced in the context of creating new synthetics projects?
15. How does the CHANGELOG.md file address the issue of varying NDJSON chunked response sizes?
16. What is the significance of the number #680 in the context of the document?
17. What problem is addressed by skipping the addition of empty values for locations?
18. How many bug fixes are explicitly mentioned in the provided context?
19. What is the file path of the CHANGELOG.md document?
20. What is the file path of the document being referenced in the context information?
...

Generated questions:
1. What command is referenced in relation to the bug fix in the CHANGELOG.md?
2. On what date was the CHANGELOG.md file created?
3. What is the primary purpose of the document based on the context provided?

Total number of questions generated: 3

Processing Question 1 of 3:

Evaluation Result:
+---------------------------------------------------+-------------------------------------------------+----------------------------------------------------+--------------------+--------------------+-----------------+----------------+----------------+
| Query                                             | Response                                        | Source                                             | Relevancy Response | Relevancy Feedback | Relevancy Score | Faith Response | Faith Feedback |
+===================================================+=================================================+====================================================+====================+====================+=================+================+================+
| What command is referenced in relation to the bug | The `init` command is referenced in relation to | Bug Fixes                                          | Pass               | YES                | 1               | Pass           | YES            |
| fix in the CHANGELOG.md?                          | the bug fix in the CHANGELOG.md.                |                                                    |                    |                    |                 |                |                |
|                                                   |                                                 |                                                    |                    |                    |                 |                |                |
|                                                   |                                                 | - Pick the correct loader when bundling TypeScript |                    |                    |                 |                |                |
|                                                   |                                                 | or JavaScript journey files                        |                    |                    |                 |                |                |
|                                                   |                                                 |                                                    |                    |                    |                 |                |                |
|                                                   |                                                 | during push command #626                           |                    |                    |                 |                |                |
+---------------------------------------------------+-------------------------------------------------+----------------------------------------------------+--------------------+--------------------+-----------------+----------------+----------------+

Processing Question 2 of 3:

Evaluation Result:
+-------------------------------------------------+------------------------------------------------+-----------------------------+--------------------+--------------------+-----------------+----------------+----------------+
| Query                                           | Response                                       | Source                      | Relevancy Response | Relevancy Feedback | Relevancy Score | Faith Response | Faith Feedback |
+=================================================+================================================+=============================+====================+====================+=================+================+================+
| On what date was the CHANGELOG.md file created? | The date mentioned in the CHANGELOG.md file is | v1.0.0-beta-38 (2022-11-02) | Pass               | YES                | 1               | Pass           | YES            |
|                                                 | November 2, 2022.                              |                             |                    |                    |                 |                |                |
+-------------------------------------------------+------------------------------------------------+-----------------------------+--------------------+--------------------+-----------------+----------------+----------------+

Processing Question 3 of 3:

Evaluation Result:
+---------------------------------------------------+---------------------------------------------------+-----------------------------+--------------------+--------------------+-----------------+----------------+----------------+
| Query                                             | Response                                          | Source                      | Relevancy Response | Relevancy Feedback | Relevancy Score | Faith Response | Faith Feedback |
+===================================================+===================================================+=============================+====================+====================+=================+================+================+
| What is the primary purpose of the document based | The primary purpose of the document is to provide | v1.0.0-beta-38 (2022-11-02) | Pass               | YES                | 1               | Pass           | YES            |
| on the context provided?                          | a changelog detailing the features and            |                             |                    |                    |                 |                |                |
|                                                   | improvements made in version 1.0.0-beta-38 of a   |                             |                    |                    |                 |                |                |
|                                                   | software project. It highlights specific          |                             |                    |                    |                 |                |                |
|                                                   | enhancements such as improved validation for      |                             |                    |                    |                 |                |                |
|                                                   | monitor schedules and an enhanced push command    |                             |                    |                    |                 |                |                |
|                                                   | experience.                                       |                             |                    |                    |                 |                |                |
+---------------------------------------------------+---------------------------------------------------+-----------------------------+--------------------+--------------------+-----------------+----------------+----------------+
```
evaluation.py
@@ -0,0 +1,197 @@
import logging
import sys
import os
import pandas as pd
from dotenv import load_dotenv
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Response
from llama_index.core.evaluation import (
    DatasetGenerator,
    RelevancyEvaluator,
    FaithfulnessEvaluator,
    EvaluationResult,
)
from llama_index.llms.openai import OpenAI
from tabulate import tabulate
import textwrap
import argparse
import traceback
from httpx import ReadTimeout

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

parser = argparse.ArgumentParser(
    description="Process documents and questions for evaluation."
)
parser.add_argument(
    "--num_documents",
    type=int,
    default=None,
    help="Number of documents to process (default: all)",
)
parser.add_argument(
    "--skip_documents",
    type=int,
    default=0,
    help="Number of documents to skip at the beginning (default: 0)",
)
parser.add_argument(
    "--num_questions",
    type=int,
    default=None,
    help="Number of questions to process (default: all)",
)
parser.add_argument(
    "--skip_questions",
    type=int,
    default=0,
    help="Number of questions to skip at the beginning (default: 0)",
)
parser.add_argument(
    "--process_last_questions",
    action="store_true",
    help="Process last N questions instead of first N",
)
args = parser.parse_args()
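
# Example invocations (illustrative; flag names as defined above):
#   python evaluation.py
#   python evaluation.py --num_documents 5 --skip_documents 2
#   python evaluation.py --num_questions 3 --process_last_questions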

load_dotenv(".env")

# Load the documents to evaluate from the local export directory.
reader = SimpleDirectoryReader("/tmp/elastic/production-readiness-review")
documents = reader.load_data()
print(f"First document: {documents[0].text}")
print(f"Second document: {documents[1].text}")
print(f"Third document: {documents[2].text}")

if args.skip_documents > 0:
    documents = documents[args.skip_documents :]

if args.num_documents is not None:
    documents = documents[: args.num_documents]

print(f"Number of documents loaded: {len(documents)}")

llm = OpenAI(model="gpt-4o", request_timeout=120)

data_generator = DatasetGenerator.from_documents(documents, llm=llm)

try:
    eval_questions = data_generator.generate_questions_from_nodes()
    if isinstance(eval_questions, str):
        eval_questions_list = eval_questions.strip().split("\n")
    else:
        eval_questions_list = eval_questions
    eval_questions_list = [q for q in eval_questions_list if q.strip()]

    if args.skip_questions > 0:
        eval_questions_list = eval_questions_list[args.skip_questions :]

    if args.num_questions is not None:
        if args.process_last_questions:
            eval_questions_list = eval_questions_list[-args.num_questions :]
        else:
            eval_questions_list = eval_questions_list[: args.num_questions]

    print("\nAll available questions generated:")
    for idx, q in enumerate(eval_questions):
        print(f"{idx}. {q}")

    print("\nGenerated questions:")
    for idx, q in enumerate(eval_questions_list, start=1):
        print(f"{idx}. {q}")
except ReadTimeout:
    print(
        "Request to OpenAI timed out during question generation. Please check the server or increase the timeout duration."
    )
    traceback.print_exc()
    sys.exit(1)
except Exception as e:
    print(f"An error occurred while generating questions: {e}")
    traceback.print_exc()
    sys.exit(1)

print(f"\nTotal number of questions generated: {len(eval_questions_list)}")

evaluator_relevancy = RelevancyEvaluator(llm=llm)
evaluator_faith = FaithfulnessEvaluator(llm=llm)

vector_index = VectorStoreIndex.from_documents(documents)

def display_eval_df(
    query: str,
    response: Response,
    eval_result_relevancy: EvaluationResult,
    eval_result_faith: EvaluationResult,
) -> None:
    """Print one evaluation result as a wrapped grid table."""
    relevancy_feedback = getattr(eval_result_relevancy, "feedback", "")
    relevancy_passing = getattr(eval_result_relevancy, "passing", False)
    relevancy_passing_str = "Pass" if relevancy_passing else "Fail"

    relevancy_score = 1.0 if relevancy_passing else 0.0

    faithfulness_feedback = getattr(eval_result_faith, "feedback", "")
    faithfulness_passing_bool = getattr(eval_result_faith, "passing", False)
    faithfulness_passing = "Pass" if faithfulness_passing_bool else "Fail"

    def wrap_text(text, width=50):
        if text is None:
            return ""
        text = str(text)
        text = text.replace("\r", "")
        lines = text.split("\n")
        wrapped_lines = []
        for line in lines:
            wrapped_lines.extend(textwrap.wrap(line, width=width))
            wrapped_lines.append("")
        return "\n".join(wrapped_lines)

    if response.source_nodes:
        source_content = wrap_text(response.source_nodes[0].node.get_content())
    else:
        source_content = ""

    eval_data = {
        "Query": wrap_text(query),
        "Response": wrap_text(str(response)),
        "Source": source_content,
        "Relevancy Response": relevancy_passing_str,
        "Relevancy Feedback": wrap_text(relevancy_feedback),
        "Relevancy Score": wrap_text(str(relevancy_score)),
        "Faith Response": faithfulness_passing,
        "Faith Feedback": wrap_text(faithfulness_feedback),
    }

    eval_df = pd.DataFrame([eval_data])

    print("\nEvaluation Result:")
    print(
        tabulate(
            eval_df, headers="keys", tablefmt="grid", showindex=False, stralign="left"
        )
    )

query_engine = vector_index.as_query_engine(llm=llm)

total_questions = len(eval_questions_list)
for idx, question in enumerate(eval_questions_list, start=1):
    try:
        response_vector = query_engine.query(question)
        eval_result_relevancy = evaluator_relevancy.evaluate_response(
            query=question, response=response_vector
        )
        eval_result_faith = evaluator_faith.evaluate_response(response=response_vector)

        print(f"\nProcessing Question {idx} of {total_questions}:")
        display_eval_df(
            question, response_vector, eval_result_relevancy, eval_result_faith
        )
    except ReadTimeout:
        print(f"Request to OpenAI timed out while processing question {idx}.")
        traceback.print_exc()
        continue
    except Exception as e:
        print(f"An error occurred while processing question {idx}: {e}")
        traceback.print_exc()
        continue

0 commit comments
