ProductMatchAlgorithm
A Python tool that matches products between catalogs using a hybrid TF-IDF algorithm, with weighted scoring on colors (including synonyms), categories, stem length, and item numbers.
66.7% authorship, 85%+ matching precision
Quick Stats
Period: Nov 2025 - Jan 2026
My commits / Total: 4 / 6
0
Lead Developer
66.7%
Project Metrics
Period: Nov 2025 - Jan 2026
Role: Lead Developer
Team: 2 people
Precision: 85%+
Processing: <30s
Authorship: 66.7%
Matches Improved: +15%
Tech Stack
Key Features
TF-IDF Matching
Name vectorization with cosine similarity
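As a minimal sketch of the idea, the snippet below builds TF-IDF vectors for a few product names and compares them with cosine similarity. It is hand-rolled with the standard library so it runs without dependencies; the project itself lists scikit-learn for this step, so the functions `tfidf_vectors` and `cosine`, the whitespace tokenizer, and the sample names are all illustrative assumptions rather than the project's code.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF, so tokens present in every document keep a nonzero weight.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical product names, for illustration only.
names = ["red rose freedom 60cm", "rose freedom red", "white tulip"]
vecs = tfidf_vectors(names)
sim_close = cosine(vecs[0], vecs[1])  # shares three tokens
sim_far = cosine(vecs[0], vecs[2])    # shares no tokens
```

Names that share most of their tokens score close to 1, while names with no tokens in common score exactly 0.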
Color Scoring
Token overlap + automatic synonym expansion
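One possible way to score colors is sketched below: each color string is tokenized, known synonyms are expanded to a canonical color, and the overlap is measured with a Jaccard ratio. The synonym table, the Jaccard formula, and the names `COLOR_SYNONYMS`, `color_tokens`, and `color_score` are assumptions made for this example; the source does not specify them.

```python
# Hypothetical synonym table; the project's real table is not shown in the source.
COLOR_SYNONYMS = {
    "burgundy": "red",
    "crimson": "red",
    "ivory": "white",
    "cream": "white",
}


def color_tokens(text):
    """Lowercase, split on whitespace and '/', and add the canonical color for known synonyms."""
    tokens = set(text.lower().replace("/", " ").split())
    return tokens | {COLOR_SYNONYMS[t] for t in tokens if t in COLOR_SYNONYMS}


def color_score(a, b):
    """Jaccard overlap of the expanded token sets (0.0 when either side is empty)."""
    ta, tb = color_tokens(a), color_tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

With this table, "Burgundy" and "red" overlap through the expanded token "red", whereas "white" and "red" share nothing and score 0.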
Category Match
Fuzzy matching between category names, accepted at >=90% similarity
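A threshold check of this kind could look like the sketch below. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure; which fuzzy-matching library the project uses is not stated, and the function name `category_match` is hypothetical.

```python
from difflib import SequenceMatcher


def category_match(a, b, threshold=0.90):
    """Return True when two category names are at least `threshold` similar."""
    # Normalize case and surrounding whitespace before comparing.
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return ratio >= threshold
```

Small spelling differences such as "Spray Roses" versus "Spray Rose" pass the threshold, while unrelated categories do not.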
Stem Length Groups
Matching by length group: <50 cm, 50-60 cm, >60 cm
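The grouping can be sketched as a simple bucketing function. Placing the boundary values 50 and 60 in the middle group follows from the group labels; the binary 1.0/0.0 score and the names `stem_group` and `stem_score` are assumptions for illustration.

```python
def stem_group(length_cm):
    """Map a stem length in centimeters to one of the three groups."""
    if length_cm < 50:
        return "<50cm"
    if length_cm <= 60:
        return "50-60cm"
    return ">60cm"


def stem_score(a_cm, b_cm):
    """Assumed scoring: 1.0 when both stems fall in the same group, else 0.0."""
    return 1.0 if stem_group(a_cm) == stem_group(b_cm) else 0.0
```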
Item Number Overlap
Token overlap between variety and item number
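A rough sketch of this overlap follows. It splits both fields into alphanumeric tokens and divides the shared tokens by the size of the smaller set; that normalization, the regex tokenizer, the sample item numbers, and the names `tokens` and `item_overlap` are assumptions, since the source only says "token overlap".

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens, so 'ROS-FREEDOM-60' yields {'ros', 'freedom', '60'}."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def item_overlap(variety, item_number):
    """Share of the smaller token set that also appears in the other field."""
    v, i = tokens(variety), tokens(item_number)
    if not v or not i:
        return 0.0
    return len(v & i) / min(len(v), len(i))
```

For example, the variety "Freedom" is fully contained in the hypothetical item number "ROS-FREEDOM-60", which gives the maximum overlap.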
Excel Output
Best Match plus Top Matches (candidates within a ±0.03 score neighborhood)
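Reading the 0.03 neighborhood as "every candidate whose score lies within 0.03 of the best score", the selection step before writing the report might be sketched as follows. That reading, the function name `select_matches`, and the small tolerance for floating-point rounding are assumptions; writing the result to Excel is left out.

```python
def select_matches(scored, delta=0.03):
    """Given (candidate, score) pairs, return (best, top_matches).

    top_matches holds every candidate whose score is within `delta` of the best.
    """
    if not scored:
        return None, []
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    best = ranked[0]
    # Small tolerance so rounding error does not drop a candidate sitting on the boundary.
    top = [pair for pair in ranked if best[1] - pair[1] <= delta + 1e-9]
    return best, top


# Hypothetical scores for three candidates.
best, top = select_matches([("B", 0.89), ("A", 0.91), ("C", 0.80)])
```

Here candidate A is the best match, B is kept as a near tie, and C falls outside the neighborhood.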
My Contribution
Role: Lead Developer
Key Contributions
- Hybrid matching algorithm design
- TF-IDF implementation with scikit-learn
- Multi-component weighted scoring system
- Text normalization and synonym expansion
- Excel report generation with XlsxWriter
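To illustrate how a multi-component weighted score can be combined, the sketch below takes per-component scores in the range 0 to 1 and returns their weighted average. The weights and the names `WEIGHTS` and `combined_score` are placeholders chosen for the example, not the project's actual values.

```python
# Placeholder weights for illustration; the project's real weights are not given.
WEIGHTS = {"name": 0.5, "color": 0.2, "category": 0.15, "stem": 0.1, "item": 0.05}


def combined_score(components, weights=WEIGHTS):
    """Weighted average of component scores; missing components count as 0.0."""
    total = sum(weights.values())
    return sum(w * components.get(key, 0.0) for key, w in weights.items()) / total
```

Dividing by the sum of the weights keeps the result in the range 0 to 1 even if the weights are later retuned and no longer add up to 1.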
Achievements
Precision: 85%+
Processing: <30s
Authorship: 66.7%
Matches Improved: +15%