
Project Details

Technologies Used

Languages

Sierra is programmed primarily in Python. Some parts, such as the GUI, use CSS, HTML, and JavaScript.

Repository

We used GitHub as our repository.

Virtual Machines

We used Oracle VM VirtualBox to create virtual machines on our local computers to test Sierra. Later in the semester, Henry hosted several virtual machines on his server, and we connected to them with Remote Desktop Connection.


Python 68.2%
CSS 15.2%
HTML 9.2%
JavaScript 7.4%

Framework

Framework Overview

Sierra's framework is made up of three areas, split into six components. The three areas are the Enemy Network, Data Exfiltration, and the Blue Network. Information is acquired from the enemy within their network via the Sierra Client. The Sierra Client spies on the Enemy Machine and temporarily stores data in the Transient Encrypted Database. Browsing the internet triggers the Sierra Client to exfiltrate data via both Reddit and Google Drive. The Sierra Server, residing within the Blue Network, accesses the communication channel and retrieves the intelligence. It then parses the intelligence and stores it in the Permanent Encrypted Database. When investigators want to look at the acquired data, they can access it through the GUI, where the intelligence is shown in an easy-to-read format. Should investigators need to, they may send commands to the Sierra Client over the communication channel.


Framework: Data Exfiltration

Data Exfiltration

In our research, we found that commercial spyware generally relies on end-to-end communication with a central server for data exfiltration. This form of data exfiltration has proved to be trackable. We mask our data transfer by routing it through intermediate sites instead of a direct connection to an external server. This allows Sierra Server to avoid being tracked. Instead of being accessible from the internet, Sierra Server lies dormant until the investigators request that it connect to the online accounts for data retrieval.

Sierra Client also uses common websites to exfiltrate data to avoid detection while operating within the enemy network. This way, if the user attempts to find malicious beaconing behavior on their machine through network analysis, Sierra's traffic will be obscured among ordinary web browsing. The transfer method would also theoretically bypass heuristic-based and signature-based intrusion detection software, because the behavior and signature are those of an ordinary user browsing a popular website. This is inherently non-malicious behavior and should not be flagged, unlike other implementations we found in our research.

Framework: Enemy Network

Enemy Network

Transient Encrypted Database

The exfiltration methodology described above requires Sierra to be able to hold data for unknown periods of time. The user may not browse the internet for a few hours, days, or even weeks. It could be that Sierra acquires all the intelligence it can on the user, but the user turns the machine off immediately after. Because Sierra needs to hold collected data in an encrypted format on nonvolatile storage, we created the Transient Encrypted Database. It is a relational database that stores the intelligence Sierra acquires. While information is being acquired, the database is held in memory. Afterward, the database is encrypted with AES-256 and written to the local directory where the Sierra binary is held. If the user shuts down their computer, Sierra can still load the database, decrypt it, and, when possible, exfiltrate it. If Sierra has completed intelligence acquisition but the user has not yet gone online, Sierra will lie dormant with the database safe on nonvolatile storage. When the user browses the internet, Sierra also connects to that site and begins exfiltrating the database. Once the entire database has been sent, Sierra removes the Transient Encrypted Database from the disk.

Sierra Client

From the target user's machine, the Sierra Client exfiltrates Discord chat logs, Discord voice calls, keylogs, screenshots, webcam images, web history, cookies, passwords, and Gmail emails. As data is captured, it is stored in the Transient Encrypted Database described above, where it waits to be exfiltrated through Google Drive and Reddit.

Framework: Blue Network

Blue Network

Permanent Encrypted Database

The Transient and Permanent Databases are nearly the same. Unlike the Transient Database, however, the Permanent Database stores all collected data consolidated in one place. It will be encrypted with an AES key known to the end user. This AES key will also be used to log the user in to the server.

GUI for Investigators

Sierra will be capturing a massive amount of data, from Discord communications to images, audio, and internet traffic. Given the quantity and diversity of the data collected, Sierra needs a way to present information in an orderly manner so that investigators can view it easily. The end user of Sierra Server is an investigator, likely affiliated with law enforcement, attempting to prevent corporate espionage or similar situations. Because Sierra Server is not meant to be used by a cyber operator, it has an easy-to-use graphical user interface. Flask, a web framework for Python, is used to accomplish this. Sierra Server retrieves and parses data from the server-side database, then displays it in a GUI for the investigators to monitor. Giving the end user a straightforward and clean dashboard UI is an important goal in the development of this software.

Sierra Server

The Sierra Server periodically gathers information sent by the Sierra Client from Reddit and Google Drive. This information is encrypted. Sierra Server decrypts the data, parses it, and consolidates it into the permanent database. The parsed data is available for investigators to view through a GUI. The Sierra Server will also be able to send commands to Sierra Client via the aforementioned communication link. This is done to ensure that investigators can request more intelligence, as needed.
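The parse-and-consolidate step described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: each batch is taken to be already decrypted and encoded as a JSON list, and the `records` table, its column names, and the functions `open_store` and `consolidate` are hypothetical rather than Sierra's actual schema.

```python
import json
import sqlite3

# Hypothetical schema: one table holding every record from every batch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS records (
    record_id   TEXT PRIMARY KEY,
    kind        TEXT NOT NULL,
    captured_at TEXT NOT NULL,
    payload     TEXT NOT NULL
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the consolidated database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def count_records(conn: sqlite3.Connection) -> int:
    """Return the number of rows currently stored."""
    return conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]


def consolidate(conn: sqlite3.Connection, batch_json: str) -> int:
    """Parse one decrypted batch and merge it; return how many rows were new."""
    rows = [
        (r["record_id"], r["kind"], r["captured_at"], json.dumps(r["payload"]))
        for r in json.loads(batch_json)
    ]
    before = count_records(conn)
    with conn:  # commit on success, roll back if any insert fails
        # OR IGNORE skips records whose record_id is already stored, so a
        # batch that is retrieved twice does not create duplicates.
        conn.executemany("INSERT OR IGNORE INTO records VALUES (?, ?, ?, ?)", rows)
    return count_records(conn) - before
```

Keying on a record identifier makes the merge idempotent, which matters when the server polls the same channel periodically and may see a batch more than once.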

Skills in Action

Through this project, we conducted secondary research to find existing information on spyware and cyber security. Next, we learned project management tools, such as Jira, and divided up tasks. We learned to set up virtual environments for testing Sierra's individual functionalities. We researched Python tools and libraries and learned to use them within our project. Since most of the team did not have experience with cyber security, we had to learn from scratch how to build most of the functionality we implemented. We learned about network analysis, MITM attacks, cryptography, and more. We learned to consolidate all of the functionalities into one program and to work with each other's code. Finally, we documented the work we had completed and presented our initial goals, results, and findings in a written format and through a live presentation.