ProductMatchAlgorithm

Python tool that matches products between catalogs using a hybrid algorithm: TF-IDF similarity on product names combined with weighted scoring on colors (with synonyms), categories, stem length, and item numbers.

66.7% authorship, 85%+ matching precision

Quick Stats

Period

Nov 2025 - Jan 2026

Commits

4 / 6

My commits / Total

Team Size

2

Lead Developer

Contribution

66.7%

Project Metrics

Team

2 people

85%+

Precision

<30s

Processing

66.7%

Authorship

+15%

Matches Improved

Tech Stack

Python 3.8+
scikit-learn
Pandas
NumPy
XlsxWriter
tqdm
argparse

Key Features

TF-IDF Matching

Name vectorization with cosine similarity
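
A minimal sketch of name vectorization with cosine similarity, using scikit-learn from the tech stack. The function name `name_similarity`, the sample product names, and the default word-level vectorizer settings are assumptions for illustration; the project's actual configuration is not shown here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def name_similarity(source_names, target_names):
    """Return a (len(source), len(target)) matrix of cosine similarities."""
    vectorizer = TfidfVectorizer()  # default word-level tokens (an assumption)
    # Fit on both catalogs so they share one vocabulary and IDF weighting.
    vectorizer.fit(list(source_names) + list(target_names))
    source_vectors = vectorizer.transform(source_names)
    target_vectors = vectorizer.transform(target_names)
    return cosine_similarity(source_vectors, target_vectors)


# Hypothetical catalog entries.
sims = name_similarity(["Rose Red Naomi 60cm"], ["Red Naomi rose", "Tulip Yellow"])
best = sims[0].argmax()  # index of the most similar target name
```

Fitting on the union of both catalogs keeps the two sets of vectors comparable; fitting on only one side would drop terms unique to the other.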

Color Scoring

Token overlap + automatic synonym expansion
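
One way token overlap with synonym expansion could look, as a sketch only. The `COLOR_SYNONYMS` table, the function names, and the use of Jaccard overlap are all assumptions; the project's real synonym list and overlap formula are not given here.

```python
# Hypothetical synonym table; the project's actual list is not shown in the source.
COLOR_SYNONYMS = {
    "red": {"crimson", "scarlet"},
    "pink": {"rose", "blush"},
    "white": {"ivory", "cream"},
}


def expand_colors(text):
    """Tokenize a color string and add every synonym of each recognized color."""
    tokens = set(text.lower().replace("/", " ").split())
    expanded = set(tokens)
    for canonical, synonyms in COLOR_SYNONYMS.items():
        family = synonyms | {canonical}
        if tokens & family:
            expanded |= family
    return expanded


def color_score(a, b):
    """Jaccard overlap of the expanded token sets (assumed scoring formula)."""
    tokens_a, tokens_b = expand_colors(a), expand_colors(b)
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
```

With this table, `color_score("Crimson", "red")` scores a full match because both sides expand to the same color family.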

Category Match

Fuzzy matching between categories with a similarity threshold of >=90%
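
A sketch of a 90% fuzzy threshold on category names. It uses `difflib.SequenceMatcher` from the standard library as a stand-in; which fuzzy-matching routine the project actually uses is not stated, and `categories_match` is a hypothetical name.

```python
from difflib import SequenceMatcher


def categories_match(a, b, threshold=0.90):
    """True when two category names are at least `threshold` similar."""
    # Normalize case and whitespace before comparing.
    a_norm = " ".join(a.lower().split())
    b_norm = " ".join(b.lower().split())
    if not a_norm or not b_norm:
        return False
    return SequenceMatcher(None, a_norm, b_norm).ratio() >= threshold
```

A threshold this high tolerates plural forms or small typos ("Spray Roses" vs "spray rose") while rejecting unrelated categories.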

Stem Length Groups

Matching by groups: <50cm, 50-60cm, >60cm
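
The three groups can be sketched as a simple bucketing function. Treating 50 cm and 60 cm as part of the middle group follows the labels above; the function names and the 1.0/0.0 scoring are assumptions.

```python
def stem_length_group(length_cm):
    """Map a stem length in centimeters to one of the three groups."""
    if length_cm is None:
        return None
    if length_cm < 50:
        return "<50cm"
    if length_cm <= 60:
        return "50-60cm"
    return ">60cm"


def stem_score(a_cm, b_cm):
    """1.0 when both lengths are known and fall in the same group, else 0.0."""
    group_a, group_b = stem_length_group(a_cm), stem_length_group(b_cm)
    return 1.0 if group_a is not None and group_a == group_b else 0.0
```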

Item Number Overlap

Token overlap between variety and item number
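
A possible shape for the variety/item-number overlap, as a sketch. The tokenization rule (lowercase alphanumeric runs), the function names, and normalizing by the number of item-number tokens are assumptions.

```python
import re


def tokens(text):
    """Split text into a set of lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def item_number_overlap(variety, item_number):
    """Fraction of item-number tokens that also appear in the variety name."""
    item_tokens = tokens(item_number)
    if not item_tokens:
        return 0.0
    return len(tokens(variety) & item_tokens) / len(item_tokens)
```

For the hypothetical pair `"Red Naomi 60"` and `"RN-NAOMI-60"`, two of the three item-number tokens (`naomi`, `60`) appear in the variety name.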

Excel Output

Best Match + Top Matches within a ±0.03 score neighborhood
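
A sketch of how the Top Matches could be selected before they are written to Excel. It assumes the 0.03 neighborhood means "candidates scoring within 0.03 of the best score"; that reading, and the function name `best_and_top_matches`, are not confirmed by the source.

```python
def best_and_top_matches(scored, neighborhood=0.03):
    """Pick the best (name, score) pair and all pairs within `neighborhood` of it."""
    if not scored:
        return None, []
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    best = ranked[0]
    top = [pair for pair in ranked if best[1] - pair[1] <= neighborhood]
    return best, top


# Hypothetical candidate scores for one source product.
best, top = best_and_top_matches([("A", 0.91), ("B", 0.89), ("C", 0.70)])
```

The resulting lists can then be turned into the two report sections with pandas and XlsxWriter.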

My Contribution

66.7%

Contribution

Role

Lead Developer

Key Contributions

  • Hybrid matching algorithm design
  • TF-IDF implementation with scikit-learn
  • Multi-component weighted scoring system
  • Text normalization and synonym expansion
  • Excel report generation with XlsxWriter
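
The multi-component weighted scoring could be combined roughly as follows. This is a sketch: the `WEIGHTS` values and component keys are hypothetical placeholders, since the actual weights are not listed here.

```python
# Hypothetical weights; the source does not state the actual values.
WEIGHTS = {"name": 0.5, "color": 0.2, "category": 0.15, "stem": 0.1, "item": 0.05}


def combined_score(components, weights=WEIGHTS):
    """Weighted average of component scores; missing components count as 0."""
    total_weight = sum(weights.values())
    weighted = sum(weight * components.get(key, 0.0) for key, weight in weights.items())
    return weighted / total_weight
```

Dividing by the total weight keeps the result in the 0 to 1 range even if the weights are later retuned and no longer sum to 1.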

Challenges Solved