
Added adapter to test LLMs on a real-world dataset: ManyTypes4Py #9


Open
wants to merge 162 commits into
base: main
162 commits
e7baee9
Update README.md
ashwinprasadme Oct 18, 2023
96a5af7
Cleanup
ashwinprasadme Oct 20, 2023
d9b37a1
Add Leaderboard
ashwinprasadme Oct 20, 2023
03a6025
Dockerfile, Leaderboard generation, Minor fixes
ashwinprasadme Oct 20, 2023
3678ed7
Minor README
ashwinprasadme Oct 21, 2023
aedffec
Leaderboard update
ashwinprasadme Oct 21, 2023
1197cad
Update Readme
ashwinprasadme Oct 23, 2023
44eecc7
ollama support initial working script
ashwinprasadme Nov 27, 2023
c770528
Updated prompts and refactoring
ashwinprasadme Nov 28, 2023
5ba76fc
OllamaRunner support with multiple log handling
ashwinprasadme Nov 29, 2023
9e897cc
Moving to ChatOpenAI, handle multi-tool results
ashwinprasadme Dec 1, 2023
5c0b082
Handling ollama server status
ashwinprasadme Dec 7, 2023
fc54bfc
Adding timeouts and errors
ashwinprasadme Dec 8, 2023
8f8808f
WIP Prompt termination
ashwinprasadme Dec 10, 2023
1f6e0c8
Multiprocess terminate
ashwinprasadme Dec 10, 2023
6ccf3dc
Questions based prompt and response translator
ashwinprasadme Dec 10, 2023
043cd1a
LLMs | Training set and jsonl for fine tuning
Samzcodez Dec 10, 2023
a4ec08e
Fixed results scripts
ashwinprasadme Dec 11, 2023
cb1055c
Minor path fix
ashwinprasadme Dec 11, 2023
67eafaf
Translation fixes
ashwinprasadme Dec 11, 2023
30817d6
Adding another questions based prompt
ashwinprasadme Dec 11, 2023
3deda69
Minor fix
ashwinprasadme Dec 11, 2023
a026677
Translator and added prompts
ashwinprasadme Dec 11, 2023
9cd7a26
Prompt finalized
ashwinprasadme Dec 12, 2023
d2b26be
Minor fix questions
ashwinprasadme Dec 12, 2023
967989d
Refactor finetuning folders
ashwinprasadme Dec 13, 2023
26f86e8
Fix P&R Calculations in edge cases
ashwinprasadme Dec 13, 2023
dceec00
Fix P&R Minor
ashwinprasadme Dec 13, 2023
45707ef
Finetuning dataset v1
ashwinprasadme Dec 13, 2023
ec08c67
Fine tuning datasets
ashwinprasadme Dec 14, 2023
c631f59
Init llama fine tuning
ashwinprasadme Dec 14, 2023
a5a0242
Multi model fine-tuning
ashwinprasadme Dec 14, 2023
65de7e2
Minor Multi-model FT
ashwinprasadme Dec 14, 2023
186a6a8
Minor error handling
ashwinprasadme Dec 14, 2023
c96b32a
Check for col_offset
ashwinprasadme Dec 15, 2023
355f839
[WIP] Auto-generation of finetuning dataset
ashwinprasadme Dec 15, 2023
96b2141
llama finetuning script
ashwinprasadme Dec 15, 2023
29b7e02
Auto-generation of finetuning dataset | More training templates
Samzcodez Dec 16, 2023
54de195
Error check to auto-generated finetuning dataset | Tuple, exception …
Samzcodez Dec 16, 2023
f8066f4
[WIP] Auto-gen microbench structure
ashwinprasadme Dec 17, 2023
7d0ed8d
Example test case
ashwinprasadme Dec 17, 2023
b5bfbc3
QB4 Prompt
ashwinprasadme Dec 17, 2023
eca1a97
QB4 related fixes
ashwinprasadme Dec 17, 2023
0094560
[WIP] QB4
ashwinprasadme Dec 18, 2023
842a50d
Removing code comments from prompts
ashwinprasadme Dec 18, 2023
d0f3a36
Fix missing files, qb4 prompt, v5 finetuning
ashwinprasadme Dec 18, 2023
d26946a
Auto-generation of finetuning dataset | Bug fixes
Samzcodez Dec 18, 2023
4bef65a
Example test case | Typo fixes
Samzcodez Dec 18, 2023
4922ea3
Autogen bug fixes
ashwinprasadme Dec 19, 2023
10c2f75
QB2 finetuning
ashwinprasadme Dec 19, 2023
6a2b13b
[WIP] Finetuning script
ashwinprasadme Dec 19, 2023
6b2f61f
[WIP] Minor
ashwinprasadme Dec 21, 2023
cca6b94
WIP Minor
ashwinprasadme Dec 21, 2023
7fe80f1
Autogen dataset | Analysis sensitivities
Samzcodez Dec 21, 2023
351202d
WIP runner dump
ashwinprasadme Dec 23, 2023
599f3ea
Init callsites scripts
ashwinprasadme Dec 29, 2023
509f763
CS: Micro-bench results script
ashwinprasadme Dec 30, 2023
69e9e32
[WIP] Callgraph scripts
ashwinprasadme Dec 30, 2023
e0aa78f
callgraphs script
ashwinprasadme Dec 31, 2023
a095728
Callsites prompt update, temp change
ashwinprasadme Dec 31, 2023
a3fbc22
callgraphs prompt updates
ashwinprasadme Dec 31, 2023
95d523e
Autogen dataset | Python Features | args
Samzcodez Jan 2, 2024
c26c84f
Autogen dataset | | remove imports in args | assignments
Samzcodez Jan 2, 2024
1190f77
Autogen dataset | assignments
Samzcodez Jan 2, 2024
c56e46c
WIP
ashwinprasadme Jan 3, 2024
e3d5a23
Record run times
ashwinprasadme Jan 3, 2024
0ce5fbe
Minor naming
ashwinprasadme Jan 3, 2024
e70a2b7
Minor fix
ashwinprasadme Jan 4, 2024
8e82528
Minor fix to handle cases where results dont exist
ashwinprasadme Jan 5, 2024
ea55026
Fix dependency error
ashwinprasadme Jan 5, 2024
709d781
Finetuning datasets for Call graph and Call site
Samzcodez Jan 6, 2024
03ac2d0
Finetuning datasets for Call graph and Call site | more snippets
Samzcodez Jan 6, 2024
e9db980
Fine tuning helper scripts
ashwinprasadme Jan 14, 2024
d28a38e
update documentations and main runner
sapkotaruz11 Jan 16, 2024
bd30815
Update README.md
ashwinprasadme Jan 17, 2024
747b17a
Refactor folders
ashwinprasadme Feb 24, 2024
59f83ae
Update README LLMs
ashwinprasadme Feb 27, 2024
4bdf6cc
Merge branch 'main' of https://github.com/sapkotaruz11/TypeEvalPy int…
ashwinprasadme Feb 27, 2024
f0c1cca
Refactoring
ashwinprasadme Feb 27, 2024
98bb7e2
Minor doc links
ashwinprasadme Feb 27, 2024
ca55bae
Merge pull request #4 from secure-software-engineering/sapkotaruz11-main
ashwinprasadme Feb 27, 2024
5120ce6
Update Readme
ashwinprasadme Apr 10, 2024
ba61b96
Update Readme
ashwinprasadme Apr 10, 2024
c912cab
Update main_runner.py
ashwinprasadme Apr 21, 2024
12303e3
Merge branch 'main' into LLMs_MicroBench
ashwinprasadme Jun 25, 2024
53a3d0a
Autogen scripts
ashwinprasadme Jul 10, 2024
b207b46
Microbench fix
ashwinprasadme Jul 10, 2024
7ff724e
Refactor
ashwinprasadme Jul 10, 2024
802b08e
[Done] Builtins
ashwinprasadme Jul 10, 2024
29bed42
Handle complex types fact generation
ashwinprasadme Jul 12, 2024
830f8c2
[Done] Classes
ashwinprasadme Jul 12, 2024
24b2982
[Done] decorators, dicts, direct_calls, dynamic
ashwinprasadme Jul 15, 2024
3f794c4
[Done] functions
ashwinprasadme Jul 15, 2024
cb5ce85
[Done] kwargs, lambdas
ashwinprasadme Jul 16, 2024
e643d8b
[Done] generators, lists, mro, returns
ashwinprasadme Jul 16, 2024
5807d24
Bug fixes
ashwinprasadme Jul 16, 2024
5f94be0
Benchmark minor fix
ashwinprasadme Jul 19, 2024
892d4fc
Imported code handling, ground truth
ashwinprasadme Jul 19, 2024
3216f9b
[Done] imports ground truth
ashwinprasadme Jul 19, 2024
0172d5f
Minor Fixes
ashwinprasadme Jul 22, 2024
1f52441
Minor tweaks
ashwinprasadme Jul 22, 2024
78b05e5
Autogen Benchmark V1
ashwinprasadme Jul 22, 2024
6ab7dd7
Merge pull request #5 from secure-software-engineering/LLMs_MicroBench
ashwinprasadme Jul 22, 2024
817a1b0
Refactoring
ashwinprasadme Jul 23, 2024
335a156
Autogen benchmark
ashwinprasadme Jul 23, 2024
ac512a6
Minor fixes
ashwinprasadme Jul 23, 2024
89537c7
Run on custom benchmark
ashwinprasadme Jul 23, 2024
296fb23
Bug fixes, update headergen version
ashwinprasadme Jul 23, 2024
6b32298
Minor
ashwinprasadme Aug 1, 2024
89ffa63
Autogen Bug fix for imported facts
ashwinprasadme Aug 4, 2024
1e5a62a
LLMs Init
akshitad11 Aug 12, 2024
eacb53f
Init vLLM working
ashwinprasadme Aug 13, 2024
5319553
Integrate vllm and transformers
ashwinprasadme Aug 20, 2024
31429cf
Refactoring, batch mode for transformers
ashwinprasadme Aug 22, 2024
d48fe08
Process prompts directly, model config updated init working
ashwinprasadme Aug 27, 2024
7772eb8
Manual batching to track progress
ashwinprasadme Aug 28, 2024
238c787
Model config modified, LLMRunner
ashwinprasadme Aug 28, 2024
c1276e8
llms runner docker updates
ashwinprasadme Aug 29, 2024
369ab71
Minor
ashwinprasadme Aug 30, 2024
c169a1b
Add relative path to prompts
ashwinprasadme Aug 30, 2024
8e0010d
Changed batch_size
ashwinprasadme Aug 30, 2024
e6e6c9f
HeaderGen version bump
ashwinprasadme Sep 3, 2024
8afe904
Refactor dataset
ashwinprasadme Sep 4, 2024
7716ca2
Added jsonl dump and costing
ashwinprasadme Sep 4, 2024
7b274c8
Fixes for batch prompt creation
ashwinprasadme Sep 5, 2024
4c77a7c
Prepare dataset
ashwinprasadme Sep 14, 2024
0c45781
Minor cleanup
ashwinprasadme Oct 22, 2024
2979a5b
Merge pull request #6 from secure-software-engineering/vllms
ashwinprasadme Oct 22, 2024
e2087de
Update Leaderboard
ashwinprasadme Oct 22, 2024
ed88367
setup real world dataset and testing for llms
rashidabhar Oct 24, 2024
3232cc9
added code annotator to annotate source_code with MASK keyword
rashidabhar Oct 26, 2024
75b8279
update annotation script to work with self.variable and dictionary key
rashidabhar Oct 26, 2024
4c42b93
completed annotating micro-benchmark
rashidabhar Oct 28, 2024
4b2e2a4
updated annotation script
rashidabhar Oct 28, 2024
b024771
annotated existing file instead of creating new annotated file
rashidabhar Oct 29, 2024
bf65d23
updated annotation script to force already annotated types
rashidabhar Oct 29, 2024
3b8cc55
translate annotated file into typeeval gt format
rashidabhar Oct 31, 2024
4fc30b8
added test script to check the annotations
rashidabhar Nov 5, 2024
48407d8
update test annotation script
rashidabhar Nov 6, 2024
49e0032
update annotator script for keywords only param
rashidabhar Nov 7, 2024
6ef201b
use masked source code prompt to get the annotated code
rashidabhar Nov 7, 2024
90933c7
updated annotation script and result translator
rashidabhar Nov 10, 2024
a209480
updated prompt template
rashidabhar Nov 12, 2024
e22e88d
update prompt for masking language modeling
ashwinprasadme Nov 14, 2024
71e3ef6
updated prompts design , code annotator script and added new LLMs to …
ashwinprasadme Nov 22, 2024
2fa67c3
updated get prompt method to work on question based as well as masked…
ashwinprasadme Nov 28, 2024
60ed1be
setup runner for real world dataset
ashwinprasadme Dec 7, 2024
1bb5cb7
updated logging for memory caching and batch processing
ashwinprasadme Dec 18, 2024
d83ab32
updated pipeline for real-world llm
ashwinprasadme Jan 7, 2025
891f4d7
Result analysis for large files
ashwinprasadme Jan 13, 2025
c832373
Add indexing
ashwinprasadme Jan 13, 2025
ab4788f
Merge pull request #8 from secure-software-engineering/main
rashidabhar Jan 18, 2025
fee9211
updated pipeline
ashwinprasadme Jan 18, 2025
b4ca5b2
updated runners for real world dataset
ashwinprasadme Feb 25, 2025
bf798ca
updated latest version
ashwinprasadme Apr 16, 2025
95c847e
updated runners
ashwinprasadme Apr 27, 2025
cfdc8aa
added datapreprocessing steps
rashidabhar Apr 29, 2025
43c7c72
documented code and added readme.md
rashidabhar Apr 29, 2025
15e4781
added readme
rashidabhar Apr 29, 2025
378a0d7
added finetuning file
rashidabhar Apr 29, 2025
1b22072
Merge branch 'main' of https://github.com/secure-software-engineering…
rashidabhar Apr 29, 2025
903e8bf
delete unnecessary files
rashidabhar Apr 29, 2025
2 changes: 1 addition & 1 deletion .gitignore
@@ -180,4 +180,4 @@ src/target_tools/ollama/src/fine_tuning/wandb/*
 src/target_tools/ollama/src/fine_tuning/outputs/*
 
 # Ignore autogen files
-autogen/data
+autogen/data
122 changes: 118 additions & 4 deletions src/result_analyzer/analysis_utils.py
@@ -111,7 +111,7 @@ def format_type(_types, is_ml=False):
     for _type in _types:
         i_type_list = []
         if is_ml:
-            if _type.startswith("Union["):
+            if is_ml and _type.startswith("Union["):
                 # TODO: Improve code, should not lower() for all. e.g., MyClass
                 types_split = [
                     x.replace(" ", "").lower()
@@ -124,15 +124,31 @@ def format_type(_types, is_ml=False):
                 # i_type_list.append(_t.split("[")[0].lower())
         else:
             for _t in _type:
-                if _t.startswith("Union["):
+                if _t and _t.startswith("Union["):
                     types_split = [
                         x.replace(" ", "").lower()
                         for x in _t.split("Union[")[1].split("]")[0].split(",")
                     ]
                     i_type_list.extend(types_split)
+                elif _t and _t.startswith("Optional["):
+                    types_split = [
+                        x.replace(" ", "").lower()
+                        for x in _t.split("Optional[")[1].split("]")[0].split(",")
+                    ]
+                    types_split.append("Nonetype")
+                    i_type_list.extend(types_split)
+                elif _t and _t.startswith("Type["):
+                    types_split = [
+                        x.replace(" ", "").lower()
+                        for x in _t.split("Type[")[1].split("]")[0].split(",")
+                    ]
+                    i_type_list.extend(types_split)
+                elif _t and _t in ["None", "Unknown"]:
+                    i_type_list.append("Nonetype")
                 else:
                     # TODO: Maybe no translation should be done here
-                    i_type_list.append(_t.lower())
+                    if _t:
+                        i_type_list.append(_t.lower())
                     # i_type_list.append(_t.split("[")[0].lower())
         type_formatted.append(list(set(i_type_list)))
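
For readers following the `format_type` change, here is a minimal standalone sketch of the wrapper flattening that the added `Optional[`, `Type[` and `None`/`Unknown` branches appear to perform. The helper name `flatten_wrapper` is hypothetical and not part of the PR; it only mirrors the string splitting visible in the hunk, including the `"Nonetype"` spelling used there.

```python
from typing import List


def flatten_wrapper(type_str: str) -> List[str]:
    """Expand a Union[...]/Optional[...]/Type[...] string into lowercase member names.

    Hypothetical helper that mirrors the split-based logic in the hunk above.
    """
    # Empty or missing entries are skipped, as the added `_t and ...` guards do.
    if not type_str:
        return []

    for wrapper in ("Union[", "Optional[", "Type["):
        if type_str.startswith(wrapper):
            # Take the text between the wrapper and the first closing bracket.
            inner = type_str.split(wrapper)[1].split("]")[0]
            members = [x.replace(" ", "").lower() for x in inner.split(",")]
            if wrapper == "Optional[":
                # Optional[X] is treated as X plus the None type.
                members.append("Nonetype")
            return members

    if type_str in ("None", "Unknown"):
        return ["Nonetype"]

    # Anything else is only lowercased.
    return [type_str.lower()]


print(flatten_wrapper("Union[int, str]"))    # ['int', 'str']
print(flatten_wrapper("Optional[MyClass]"))  # ['myclass', 'Nonetype']
print(flatten_wrapper("Unknown"))            # ['Nonetype']
```

Because the inner text is cut at the first `]`, a nested member such as `Union[List[int], str]` comes out truncated in this sketch; that seems to be a property of the split-based approach itself.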

@@ -176,10 +192,14 @@ def check_match(
     if expected.get("file") != out.get("file"):
         return False
 
-    # check if line_number match
+    # # check if line_number match
     if expected.get("line_number") != out.get("line_number"):
         return False
 
+    # if "col_offset" in expected and "col_offset" in out:
+    if expected["col_offset"] != out["col_offset"]:
+        return False
+
     if "col_offset" in expected and "col_offset" in out:
         if expected["col_offset"] != out["col_offset"]:
             return False
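
The `check_match` hunk adds an unconditional `expected["col_offset"] != out["col_offset"]` comparison ahead of the existing guarded one. If either fact lacks a `col_offset` key, the unguarded lookup would raise `KeyError`. The sketch below shows the guarded form in isolation; `same_location` is a hypothetical name, and the dictionaries are assumed to look like the facts compared in the hunk.

```python
def same_location(expected: dict, out: dict) -> bool:
    """Return True when two facts point at the same source location.

    Hypothetical helper; only the file/line/column checks from the hunk
    above are reproduced here.
    """
    if expected.get("file") != out.get("file"):
        return False

    if expected.get("line_number") != out.get("line_number"):
        return False

    # Compare column offsets only when both sides actually record one,
    # so facts without a col_offset do not raise KeyError.
    if "col_offset" in expected and "col_offset" in out:
        if expected["col_offset"] != out["col_offset"]:
            return False

    return True
```

With this form, a ground-truth fact without `col_offset` still matches a tool output that has one, as long as file and line agree.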
@@ -658,3 +678,97 @@ def benchmark_count(benchmark_path):
         _a, _functions, _params, _variables = get_fact_stats(json_files)
         total_result.append([cat, _a, _functions, _params, _variables])
     return total_result
+
+
+def normalize_type(type_str, nested_level=0):
+    """
+    Normalize the type string by removing module prefixes and simplifying typing constructs.
+    Example: 'builtins.str' -> 'str',
+    'typing.Tuple[builtins.str, builtins.float]' -> 'Tuple[str, float]',
+    'musictaxonomy.spotify.models.spotifyuser' -> 'spotifyuser',
+    'List[List[Tuple[str]]]' -> 'List[List[Any]]' (parameters of nested generics become Any).
+    """
+
+    if type_str is None:
+        return None
+
+    # Remove extra quotes if present
+    if type_str.startswith('"') and type_str.endswith('"'):
+        type_str = type_str.strip('"')
+
+    # Mapping of module prefixes to remove
+    type_mappings = {
+        "builtins.": "",
+        "typing.": "",
+    }
+    # Additional type mappings
+    additional_type_mappings = {
+        "integer": "int",
+        "string": "str",
+        "dictionary": "dict",
+        "method": "Callable",
+        "func": "Callable",
+        "function": "Callable",
+        "none": "None",
+        "Nonetype": "None",
+        "nonetype": "None",
+        "NoneType": "None",
+        "Text": "str",
+    }
+
+    if type_str is None:
+        return None
+
+    # Replace module prefixes
+    for prefix, replacement in type_mappings.items():
+        type_str = type_str.replace(prefix, replacement)
+
+    # Apply additional type mappings
+    type_str = additional_type_mappings.get(type_str, type_str)
+
+    # Handle generic types (e.g., Tuple[], List[], Dict[])
+    if "[" in type_str and "]" in type_str:
+        base_type, generic_content = type_str.split("[", 1)
+        generic_content = generic_content.rsplit("]", 1)[0]
+        # Process the generic parameters recursively
+        generic_params = []
+        bracket_level = 0
+        param = ""
+        for char in generic_content:
+            if char == "[":
+                bracket_level += 1
+                param += char
+            elif char == "]":
+                bracket_level -= 1
+                param += char
+            elif char == "," and bracket_level == 0:
+                generic_params.append(param.strip())
+                param = ""
+            else:
+                param += char
+        if param:
+            generic_params.append(param.strip())
+
+        # If nested level is greater than 0, replace with Any
+        if nested_level > 0:
+            normalized_params = ["Any"]
+        else:
+            normalized_params = [
+                normalize_type(param, nested_level + 1) for param in generic_params
+            ]
+
+        return f"{base_type}[{', '.join(normalized_params)}]"
+
+    # Handle fully qualified names by extracting the last segment
+    if "." in type_str:
+        return type_str.split(".")[-1]
+
+    # Return the simplified type
+    return type_str
+
+
+def normalize_types(types):
+    """
+    Normalize the type strings in the data.
+    """
+    return [normalize_type(type_str) for type_str in types]
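
To make the behaviour of `normalize_type` easier to check, here is a condensed, self-contained sketch of its two main ideas: splitting generic parameters only on top-level commas, and collapsing the parameters of nested generics to `Any`. The names `split_top_level`, `normalize`, `PREFIXES` and `ALIASES` are illustrative assumptions, the alias table is shortened, and the quote stripping from the PR is omitted.

```python
from typing import List, Optional

# Assumed subset of the prefix and alias tables from the PR.
PREFIXES = ("builtins.", "typing.")
ALIASES = {"integer": "int", "string": "str", "NoneType": "None", "Text": "str"}


def split_top_level(params: str) -> List[str]:
    """Split on commas that are not enclosed in square brackets."""
    parts: List[str] = []
    depth = 0
    current = ""
    for char in params:
        if char == "[":
            depth += 1
        elif char == "]":
            depth -= 1
        if char == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += char
    if current:
        parts.append(current.strip())
    return parts


def normalize(type_str: Optional[str], nested_level: int = 0) -> Optional[str]:
    """Condensed sketch of the normalization logic shown in the diff."""
    if type_str is None:
        return None
    for prefix in PREFIXES:
        type_str = type_str.replace(prefix, "")
    type_str = ALIASES.get(type_str, type_str)

    if "[" in type_str and "]" in type_str:
        base, inner = type_str.split("[", 1)
        inner = inner.rsplit("]", 1)[0]
        if nested_level > 0:
            # Parameters of a generic that is itself a parameter become Any.
            params = ["Any"]
        else:
            params = [normalize(p, nested_level + 1) for p in split_top_level(inner)]
        return f"{base}[{', '.join(params)}]"

    # Fully qualified names keep only their last segment.
    return type_str.split(".")[-1]


print(normalize("typing.Tuple[builtins.str, builtins.float]"))  # Tuple[str, float]
print(normalize("List[List[Tuple[str]]]"))                      # List[List[Any]]
print(normalize("Dict[str, List[int]]"))                        # Dict[str, List[Any]]
```

The depth counter is what keeps `Dict[str, List[int]]` from being split inside `List[int]`; a plain `split(",")` would not distinguish the two commas in a type such as `Tuple[Dict[str, int], float]`.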