Contributing to Transmog¶
Contributions to Transmog are welcome and appreciated. This guide covers everything needed to get started contributing to the project.
Quick Start¶
Prerequisites¶
Python 3.8 or higher
Git
Development Setup¶
Fork and clone the repository
git clone https://github.com/your-username/transmog.git
cd transmog
Create a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Install development dependencies
pip install -e ".[dev]"
Install pre-commit hooks
pre-commit install
The pre-commit hooks handle code formatting, linting, and style checks automatically. No manual formatting is needed.
Verify the setup
python -m pytest tests/
Development Workflow¶
Making Changes¶
Create a feature branch
git checkout -b feature/your-feature-name
Make your changes
Write code, add tests, update documentation as needed.
Run tests locally
# Run all tests
python -m pytest
# Run specific test file
python -m pytest tests/test_specific.py
# Run with coverage
python -m pytest --cov=transmog
Commit your changes
Pre-commit hooks will automatically format and check code.
git add .
git commit -m "Add feature: brief description"
Push and create pull request
git push origin feature/your-feature-name
Then create a pull request on GitHub.
Types of Contributions¶
Bug Reports¶
When reporting bugs, include:
Python version and operating system
Transmog version
Minimal code example that reproduces the issue
Expected vs actual behavior
Full error traceback if applicable
Feature Requests¶
For feature requests:
Describe the use case and problem being solved
Provide examples of how the feature would be used
Consider if the feature fits Transmog’s scope and design principles
Code Contributions¶
Good first contributions:
Documentation improvements
Test coverage improvements
Bug fixes with clear reproduction steps
Performance optimizations with benchmarks
Larger contributions:
Data format support
Processing optimizations
API enhancements
Codebase Structure¶
The Transmog codebase is organized into focused, single-responsibility modules:
Core Processing Modules¶
src/transmog/core/
- Core data transformation logicflattener.py
- JSON flattening operationsextractor.py
- Array extraction logichierarchy.py
- Nested structure handlingmetadata.py
- Metadata annotationmemory.py
- Memory optimization utilities
src/transmog/process/strategies/
- Processing strategy implementationsbase.py
- Base strategy class and common utilitiesshared.py
- Shared batch processing logicmemory.py
,file.py
,batch.py
,chunked.py
,csv.py
- Specific strategies
src/transmog/process/result/
- Result handling and outputcore.py
- Core result functionalityconverters.py
- Format conversion methodswriters.py
- File writing operationsstreaming.py
- Streaming functionality
Configuration and Validation¶
src/transmog/config/
- Configuration managementsrc/transmog/validation.py
- Unified parameter validation systemsrc/transmog/error/
- Error handling and recovery strategies
This modular structure makes it easier to:
Navigate and understand specific functionality
Write focused tests for individual components
Make changes without affecting unrelated code
Maintain and extend the codebase
Development Standards¶
Code Style¶
Code style is automatically enforced by pre-commit hooks. The setup includes:
Black for code formatting
isort for import sorting
flake8 for linting
mypy for type checking
No manual formatting is required - just commit changes and the hooks handle the rest.
Testing¶
Write tests for added functionality
Maintain or improve test coverage
Use descriptive test names that explain what is being tested
Include both unit tests and integration tests where appropriate
Test Structure¶
def test_flatten_nested_objects():
"""Test that nested objects are properly flattened."""
data = {"user": {"name": "Alice", "age": 30}}
result = tm.flatten(data)
assert len(result.main) == 1
assert result.main[0]["user_name"] == "Alice"
assert result.main[0]["user_age"] == 30
Running Tests¶
# Run all tests
pytest
# Run with coverage report
pytest --cov=transmog --cov-report=html
# Run specific test categories
pytest tests/unit/
pytest tests/integration/
pytest tests/performance/
Documentation¶
Update docstrings for functions and classes
Add examples to demonstrate usage
Update relevant documentation files in
docs/
Ensure examples are runnable and accurate
API Design Principles¶
When contributing code changes, follow these principles:
Simplicity First¶
The main API should remain simple and intuitive:
# Good: Simple, clear API
result = tm.flatten(data)
# Avoid: Overly complex APIs
result = tm.flatten(data, complex_config=ComplexConfigObject())
Sensible Defaults¶
Parameters should have sensible defaults for common use cases:
# Most users should get good results with defaults
result = tm.flatten(data)
# Advanced users can customize as needed
result = tm.flatten(data, arrays="inline", batch_size=5000)
Backward Compatibility¶
Avoid breaking changes to existing APIs
Deprecate features before removing them
Provide migration paths for major changes
Performance Considerations¶
Optimize for common use cases
Provide memory-efficient options for large datasets
Include benchmarks for performance-critical changes
Pull Request Process¶
Before Submitting¶
Tests pass locally
Code follows project conventions (enforced by pre-commit)
Documentation is updated if needed
Changes are focused and atomic
Commit messages are clear and descriptive
Pull Request Template¶
When creating a pull request, include:
Description Brief description of changes and motivation.
Type of Change¶
Bug fix
New feature
Documentation update
Performance improvement
Code refactoring
Testing¶
Tests added/updated
All tests pass
Manual testing performed
Documentation¶
Documentation updated
Examples added/updated
API documentation updated
Review Process¶
Automated Checks: CI runs tests and checks
Code Review: Maintainers review the code
Discussion: Address feedback and questions
Approval: Maintainer approves the changes
Merge: Changes are merged to main branch
Advanced Topics¶
Architecture Overview¶
Transmog follows a modular design with several core components:
Processor: Main entry point for users, orchestrates transformation
Core Transformation: Flattener and Extractor for data transformation
Configuration System: Hierarchical configuration with factory methods
I/O System: Handles reading input and writing output in various formats
Error Handling: Configurable strategies for dealing with errors
Extension Points¶
Transmog provides several extension points for customization:
Custom Recovery Strategies:
from transmog.error import RecoveryStrategy class MyRecoveryStrategy(RecoveryStrategy): def recover(self, error, context=None): # Custom recovery logic return recovery_result
Custom ID Generation:
def custom_id_strategy(record): # Generate ID based on record contents return f"CUSTOM-{record.get('id', 'unknown')}" # Use with processor processor = Processor.with_custom_id_generation(custom_id_strategy)
Output Format Extensions:
from transmog.io import DataWriter, register_writer class MyCustomWriter(DataWriter): def write(self, data, destination): # Custom writing logic pass # Register the writer register_writer("custom-format", MyCustomWriter)
Performance Testing¶
When making performance-related changes:
Use the benchmarking script:
python scripts/run_benchmarks.py
Compare before and after performance metrics
Include benchmark results in pull request description
Consider memory usage as well as processing speed
Documentation Updates¶
When updating documentation:
Follow the passive voice style guidelines
Avoid personal pronouns (you, we, our, etc)
No temporal language (previously, new version, etc)
Provide complete, runnable examples
Test all code examples for accuracy
Release Process¶
Releases follow semantic versioning:
Patch (1.0.1): Bug fixes and minor improvements
Minor (1.1.0): New features, backward compatible
Major (2.0.0): Breaking changes
Getting Help¶
Documentation: Check the documentation (available after deployment)
Issues: Search existing GitHub issues
Questions: Use GitHub issues for questions and support
Contact: Reach out to maintainers for guidance on larger contributions
Code of Conduct¶
Be respectful, inclusive, and constructive in all interactions. Focus on the code and ideas, not the person. Help create a welcoming environment for all contributors.
Recognition¶
Contributors are recognized in:
Release notes for significant contributions
GitHub contributors list
Project acknowledgments
Thank you for contributing to Transmog!