Hello,
As many of you may already be aware, I serve as a mentor for the proposed Fedora GSoC project titled 'AI-Enabled Triager and Security Alert Aggregator.' I currently work as a Senior Principal Product Security Engineer at Red Hat, focusing on AI security, safety, and trustworthiness.
I recently joined the list, and it seems some questions were asked about my proposed project, which I would like to answer here:
Question from Ayush Jayaswal:
Regarding the intelligence of the tool: The idea is to either use a small pre-trained model from Hugging Face or use a machine learning method to build a solution which, given enough logs, can figure out the general health of the system. For example, if it sees frequent write errors that are increasing at a linear rate every day, it should be able to predict that the hard drive may fail in the next 6 months and that you should therefore back up as soon as possible. Similar scenarios would cover security issues as well. For this project, we should plan to limit ourselves to around 5, or at most 10, such scenarios, but with further training the system can be extended.
Regarding performance issues: The idea is to run the tool once a day or once every two days. The tool ingests logs starting from the point of the last ingestion and does not have to be always on. The user will have the option to keep it running and run it only on security logs, but again, that is configurable.
All communication needs to be done on this mailing list and on https://chat.fedoraproject.org/#/room/#google-summer-coding:fedora.im
Questions from Aazam Thakur: I think some of the above should already answer your question. The idea here is not to resolve the "pain" of triaging logs, but to gain "intelligent" insights from our logs. As an extension, the system would also suggest possible remediations.
Thank you, and let me know (on this list or on Matrix) if you have any questions.
Hi Huzaifa,
Thanks for your response,
On Mon, Mar 3, 2025 at 10:31 AM Huzaifa Sidhpurwala via Fedora Summer Coding community summer-coding@lists.fedoraproject.org wrote:
> Question from Ayush Jayaswal: Regarding the intelligence of the tool: The idea is to either use a pre-trained small model from huggingface or use a machine learning method to create a solution, where given enough logs, the system is able to figure out 1. The general health of the system, for example if it seems like frequent write errors it should be able to predict that given the fact that the errors are increasing at a linear rate everyday, the hard drive may fail in the next 6 months and therefore you should back up as soon as possible. And similar other scenarios for security issues also. For this project, we should plan to limit to around 5 or maximum 10 such scenarios, but with training the system can be extended.
Makes sense. Also, if you have any recommended articles or resources on possible scenarios, I’d love to check them out.
I have some ideas of my own, but I'd like to hear your suggestions too.
> Regarding performance issues: The idea is to run the tool once a day or once in two days. The tool ingests logs from the point of last ingestion and does not have to be always on. The user will have an option to keep it on and just run them on security logs, but again its configurable.
Ok, thanks for the clarification!
> All communication needs to be done on this mailing list and on https://chat.fedoraproject.org/#/room/#google-summer-coding:fedora.im
Cool! I'll continue this conversation on the Matrix server.
Thanks, Ayush Jayaswal