Building an Anomaly Detection Tool

This walkthrough shows how to use the Artemis Planning Agent to build a command-line tool for anomaly detection in tabular data. Starting from a simple Python template, we answer the planner's clarifying questions iteratively until we have a specific, well-structured plan with clear validation points and success criteria. The end result is a complete tool with data loading, algorithm implementation, and visualization, along with comprehensive documentation.

Project Overview

Goal: Build a CLI tool that identifies outliers in CSV files using machine learning

Before: Python template (acc6e7a)

After: Complete anomaly detection tool (f8ad1ad)

Target Users: Data scientists and analysts working with numeric datasets

Use Cases:

Fraud detection in financial transactions
Sensor error identification in IoT data
Data quality assessment and validation

Planning Process

We started with a simple statement: "I want to build an anomaly detection tool."

The Artemis Planning Agent helped us think through the requirements by asking clarifying questions:

User Requirements:

What type of data? → General tabular data
Primary use case? → General-purpose tool
What to do with anomalies? → Visualize in dashboard/charts
How to provide data? → Supply path to CSV file
What form should it take? → Command-line tool (CLI)

Technical Requirements:

Specific algorithms? → Machine learning (Isolation Forest, One-Class SVM)
Expected data scale? → Large (hundreds of thousands of rows)
Configurable parameters? → Yes - via command-line flags
How to display charts? → Generate static image files
Both algorithms or one? → Both - run both and compare results
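To make these answers concrete, here is a minimal sketch of the command-line surface they imply, using Python's standard `argparse` module. The program name, flag names, and defaults (`anomaly-detect`, `--algorithm`, `--contamination`, `--output-dir`) are illustrative assumptions, not the flags of the finished tool.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical anomaly-detection CLI."""
    parser = argparse.ArgumentParser(
        prog="anomaly-detect",  # hypothetical program name
        description="Detect outliers in a CSV file and save charts as images.",
    )
    # "How to provide data?" -> path to a CSV file
    parser.add_argument("csv_path", help="Path to the input CSV file")
    # "Both algorithms or one?" -> default to running both
    parser.add_argument(
        "--algorithm",
        choices=["isolation-forest", "one-class-svm", "both"],
        default="both",
        help="Which detector(s) to run",
    )
    # "Configurable parameters?" -> exposed as flags (assumed name and default)
    parser.add_argument(
        "--contamination",
        type=float,
        default=0.05,
        help="Expected fraction of anomalous rows",
    )
    # "How to display charts?" -> static image files written to a directory
    parser.add_argument(
        "--output-dir",
        default="output",
        help="Directory for generated chart images",
    )
    return parser


# Parse an explicit argument list so the example runs without a real shell call.
args = build_parser().parse_args(["transactions.csv", "--contamination", "0.01"])
print(args.csv_path, args.algorithm, args.contamination, args.output_dir)
```

Keeping parser construction in its own function makes the flag definitions easy to test separately from the detection code.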

From these answers, Artemis generated a 9-task plan with a clear strategy: build a working MVP first (tasks 1-6), then enhance it (tasks 7-9). Under this plan we would have a functional tool using Isolation Forest after six tasks, and only then add the second algorithm and the comparison features.

Each task in the plan comes with clear structure: specific goals and deliverables, detailed technical specifications, files to modify, and measurable success criteria. This transforms a vague request into actionable development steps with concrete validation points.

Implementation

We built the tool in two phases: first creating a working MVP, then enhancing it with comparison capabilities.

Phase 1: MVP - Working Anomaly Detection Tool (Tasks 1-6)

These six tasks created a complete, functional tool using Isolation Forest:

1. Set up project dependencies and structure - Established foundation with pandas, scikit-learn, matplotlib, and numpy for data processing, ML algorithms, and visualization. (1f17c19)

2. Implement CSV data loading and preprocessing - Built robust data pipeline with validation, encoding, and scaling optimized for large datasets. (1777042)

3. Implement Isolation Forest anomaly detection - Added first ML algorithm with configurable parameters and optimized performance for hundreds of thousands of rows. (83dfad3)

4. Implement visualization and chart generation - Created comprehensive visualization suite with PCA/t-SNE dimensionality reduction and multi-panel dashboards. (49d49d7)

5. Build CLI interface with argparse - Implemented command-line interface with parameter configuration, validation, and progress feedback. (3c7e222)

6. Integration and pipeline orchestration - Wired all components together in main.py as a single pipeline with comprehensive error handling. (d078f4a)

Result after Task 6: A fully functional anomaly detection CLI tool that processes CSV files, detects outliers using Isolation Forest, and generates clear visualizations.
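The core of the Phase 1 pipeline (loading, preprocessing, and Isolation Forest detection) can be sketched roughly as follows. This is a simplified illustration, not the code from the commits above: the function names are hypothetical, non-numeric columns are dropped rather than encoded, and missing values are filled with column medians as an assumed strategy.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler


def load_numeric_features(csv_path: str) -> pd.DataFrame:
    """Read a CSV and keep its numeric columns, filling gaps with medians."""
    frame = pd.read_csv(csv_path)
    # Simplification: drop non-numeric columns instead of encoding them.
    numeric = frame.select_dtypes(include="number").dropna(axis=1, how="all")
    if numeric.empty:
        raise ValueError("CSV contains no usable numeric columns")
    return numeric.fillna(numeric.median())


def detect_anomalies(
    features: pd.DataFrame, contamination: float = 0.05, seed: int = 0
) -> np.ndarray:
    """Return a boolean mask that is True for rows flagged as anomalies."""
    scaled = StandardScaler().fit_transform(features)
    model = IsolationForest(contamination=contamination, random_state=seed)
    labels = model.fit_predict(scaled)  # scikit-learn: -1 = anomaly, 1 = normal
    return labels == -1


# Synthetic demo: 500 ordinary points plus two obvious outliers at the end.
rng = np.random.default_rng(0)
normal_points = rng.normal(0.0, 1.0, size=(500, 2))
outliers = np.array([[8.0, 8.0], [-9.0, 7.5]])
demo = pd.DataFrame(np.vstack([normal_points, outliers]), columns=["x", "y"])

mask = detect_anomalies(demo, contamination=0.02)
print(f"{mask.sum()} of {len(demo)} rows flagged as anomalies")
```

Fixing `random_state` keeps runs reproducible, which helps when comparing results across parameter settings.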

Phase 2: Enhancement - Dual Algorithm Comparison (Tasks 7-9)

These three tasks added the second algorithm and comparison capabilities:

7. Implement One-Class SVM anomaly detection - Added second ML algorithm providing complementary detection approach with performance optimization for large-scale data. (acd5d8a)

8. Implement result comparison and aggregation - Built comparison logic with consensus detection and statistical measures of inter-algorithm agreement. (5682b45)

9. Documentation and README - Completed professional documentation with installation instructions, usage examples, algorithm guidance, and troubleshooting. (f8ad1ad)

Final Result: A tool that runs both algorithms, compares their results, identifies consensus anomalies (rows flagged by both), and reports metrics for how closely the two algorithms agree.
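To illustrate what comparing two detectors can involve, the sketch below takes two label arrays in scikit-learn's convention (-1 for anomaly, 1 for normal, as returned by `fit_predict` on both `IsolationForest` and `OneClassSVM`) and computes consensus anomalies plus a few agreement measures. The choice of metrics (raw agreement rate, Jaccard overlap, Cohen's kappa) and the function name are assumptions made for illustration; the source does not specify which measures the tool uses.

```python
import numpy as np


def compare_labels(labels_a, labels_b) -> dict:
    """Compare two detectors' labels (-1 = anomaly, 1 = normal)."""
    a = np.asarray(labels_a) == -1
    b = np.asarray(labels_b) == -1
    if a.shape != b.shape:
        raise ValueError("Label arrays must have the same shape")

    both = a & b      # flagged by both detectors (consensus)
    either = a | b    # flagged by at least one detector
    agreement = float(np.mean(a == b))

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_a, p_b = a.mean(), b.mean()
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    kappa = (agreement - expected) / (1 - expected) if expected < 1 else 1.0

    return {
        "consensus_indices": np.flatnonzero(both),
        "agreement_rate": agreement,
        "jaccard": float(both.sum() / either.sum()) if either.any() else 1.0,
        "cohen_kappa": float(kappa),
    }


# Toy labels standing in for the two detectors' outputs.
iso_labels = [1, -1, -1, 1, 1, -1, 1, 1]
svm_labels = [1, -1, 1, 1, 1, -1, -1, 1]
report = compare_labels(iso_labels, svm_labels)
print(report)
```

Raw agreement can look high simply because most rows are normal under both detectors, which is why a chance-corrected measure such as kappa is a useful complement.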

The Artemis Planning Agent breaks down requirements into manageable tasks and guides you through building each component step by step, from initial template to complete, documented tool.
