Migrate Alpine importer to advisory V2 #2111
Signed-off-by: ziad hany <ziadhany2016@gmail.com>
…aseImporterPipelineV2 Signed-off-by: ziad hany <ziadhany2016@gmail.com>
@TG1999 @pombredanne I have a question about the Alpine migration. We are fetching one URL and processing the data without grouping by CVE. The problem is that each URL reports package versions along with the CVEs they fix. How can we obtain a unique identifier for the advisories from this importer? Would it be a good idea to restructure the data into one large mapping, using the CVE as the unique identifier? Proposed structure: Example: Sources:
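To make the "large mapping keyed by CVE" idea concrete, here is a minimal sketch. It assumes each secdb JSON document has the shape `{"distroversion", "reponame", "packages": [{"pkg": {"name", "secfixes": {version: [ids]}}}]}`. The function name, the record fields, and the sample data are hypothetical and are not taken from this PR.

```python
from collections import defaultdict

# Illustrative sample only; "example-pkg" and its entries are made up.
SAMPLE_SECDB = {
    "distroversion": "v3.20",
    "reponame": "main",
    "packages": [
        {"pkg": {"name": "apache2", "secfixes": {"2.4.26-r0": ["CVE-2017-3167", "CVE-2017-7668"]}}},
        {"pkg": {"name": "example-pkg", "secfixes": {"1.0-r1": ["CVE-2017-7668", "XSA-1"]}}},
    ],
}


def group_secfixes_by_cve(secdb: dict) -> dict:
    """
    Return a mapping of CVE id -> list of fix records found in one secdb document.
    """
    by_cve = defaultdict(list)
    for package in secdb.get("packages") or []:
        pkg = package.get("pkg") or {}
        secfixes = pkg.get("secfixes") or {}
        for fixed_version, aliases in secfixes.items():
            for alias in aliases or []:
                # Only CVE ids are used as keys in this sketch; other ids are skipped.
                if not isinstance(alias, str) or not alias.startswith("CVE-"):
                    continue
                by_cve[alias].append(
                    {
                        "package": pkg.get("name"),
                        "distroversion": secdb.get("distroversion"),
                        "reponame": secdb.get("reponame"),
                        "fixed_version": fixed_version,
                    }
                )
    return dict(by_cve)


grouped = group_secfixes_by_cve(SAMPLE_SECDB)
```

With this layout, one CVE can carry several fix records (one per package and fixed version), so the CVE alone would identify a group of records rather than a single package fix.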
Signed-off-by: ziad hany <ziadhany2016@gmail.com>
    )

    for cve in aliases:
        advisory_id = f"{pkg_infos['name']}/{qualifiers['distroversion']}/{cve}"
ex:
apache2/v3.20/2.4.26-r0/CVE-2017-7668
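The example id above has four components (package name, distro version, fixed version, CVE), while the diff line under review joins only three. A minimal sketch of a helper that produces the four-part form follows. The function name is hypothetical, and whether the fixed version belongs in the final id is an assumption based only on the example.

```python
def build_advisory_id(name: str, distroversion: str, fixed_version: str, cve: str) -> str:
    """
    Return an advisory id of the form name/distroversion/fixed_version/cve.
    """
    return "/".join([name, distroversion, fixed_version, cve])


example_id = build_advisory_id("apache2", "v3.20", "2.4.26-r0", "CVE-2017-7668")
```

Including the fixed version keeps two ids distinct when the same CVE is listed under two versions of the same package in one distro release, which the three-part form would collapse into one.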
Fix duplication on advisory_id Signed-off-by: ziad hany <ziadhany2016@gmail.com>
vulnerabilities/tests/pipelines/v2_importers/test_alpine_linux_importer_pipeline.py
The logs in debug mode:
keshav-space left a comment
Thanks @ziadhany, see comments below.
    def fetch_advisory_directory_links(
        page_response_content: str,
        base_url: str,
        logger: callable = None,
    ) -> List[str]:
        """
        Return a list of advisory directory links present in `page_response_content` html string
        """
        index_page = BeautifulSoup(page_response_content, features="lxml")
        alpine_versions = [
            link.text
            for link in index_page.find_all("a")
            if link.text.startswith("v") or link.text.startswith("edge")
        ]

        if not alpine_versions:
            if logger:
                logger(
                    f"No versions found in {base_url!r}",
                    level=logging.DEBUG,
                )
            return []

        advisory_directory_links = [urljoin(base_url, version) for version in alpine_versions]

        return advisory_directory_links


    def fetch_advisory_links(
        advisory_directory_page: str,
        advisory_directory_link: str,
        logger: callable = None,
    ) -> Iterable[str]:
        """
        Yield json file urls present in `advisory_directory_page`
        """
        advisory_directory_page = BeautifulSoup(advisory_directory_page, features="lxml")
        anchor_tags = advisory_directory_page.find_all("a")
        if not anchor_tags:
            if logger:
                logger(
                    f"No anchor tags found in {advisory_directory_link!r}",
                    level=logging.DEBUG,
                )
            return iter([])
        for anchor_tag in anchor_tags:
            if anchor_tag.text.endswith("json"):
                yield urljoin(advisory_directory_link, anchor_tag.text)
@ziadhany this is a bit brittle. I've created a mirror for Alpine secdb here: https://github.com/aboutcode-org/aboutcode-mirror-alpine-secdb. Let's use this instead.
Ok, I’ll update the code. I didn’t notice we have a mirror.
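As a rough illustration of what reading from a mirror could look like in place of HTML scraping, the sketch below walks a local checkout and loads every JSON file it finds. The layout of the mirror repository is not described in this thread, so the recursive `*.json` search, the function name, and the way the checkout is obtained are all assumptions.

```python
import json
from pathlib import Path
from typing import Iterable, Tuple


def iter_secdb_documents(checkout_dir) -> Iterable[Tuple[Path, dict]]:
    """
    Yield (path, parsed JSON) for every .json file under `checkout_dir`.
    """
    # Sorting keeps the processing order stable between runs.
    for path in sorted(Path(checkout_dir).rglob("*.json")):
        with path.open(encoding="utf-8") as json_file:
            yield path, json.load(json_file)
```

Reading files from a checkout avoids depending on the anchor tags of a directory listing page, which is the part the review calls brittle.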
            return (cls.collect_and_store_advisories,)

        def advisories_count(self) -> int:
            return 0
Let's return the count based on the `packages` key.
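One plausible reading of this suggestion is to sum the number of entries under `packages` across all loaded secdb documents. The sketch below shows that reading as a standalone function; the function name and the sample documents are hypothetical, and the actual pipeline method may count differently.

```python
from typing import Iterable


def count_packages(documents: Iterable[dict]) -> int:
    """
    Return the total number of entries under the "packages" key across `documents`.
    """
    # A missing or null "packages" key contributes zero.
    return sum(len(document.get("packages") or []) for document in documents)


# Illustrative documents only.
sample_documents = [
    {"distroversion": "v3.20", "packages": [{"pkg": {"name": "apache2"}}, {"pkg": {"name": "example-pkg"}}]},
    {"distroversion": "edge", "packages": [{"pkg": {"name": "apache2"}}]},
    {"distroversion": "v3.19"},
]
total = count_packages(sample_documents)
```

Note that this counts package entries, not advisories: a package with several fixed CVEs would still count once under this reading.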