Installation#

Requirements#

Row2Vec requires Python 3.10 or higher and depends on:

  • pandas >= 1.5.3

  • scikit-learn >= 1.0.0

  • tensorflow >= 2.8.0

  • numpy >= 1.21.0

  • umap-learn >= 0.5.0

  • click >= 8.0.0

  • rich >= 12.0.0

  • pyyaml >= 6.0.0

Install from PyPI#

The simplest way to install Row2Vec:

pip install row2vec

Install from Source#

For the latest development version:

git clone https://github.com/evotext/row2vec.git
cd row2vec
pip install -e .

Install with Development Dependencies#

If you want to contribute or run tests:

git clone https://github.com/evotext/row2vec.git
cd row2vec
pip install -e ".[dev]"

This includes:

  • pytest for testing

  • ruff for linting and formatting

  • mypy for type checking

  • jupyter for notebook development

Optional Dependencies#

Row2Vec now includes all major dependencies by default. Optional extras:

For development work:#

pip install "row2vec[dev]"

For documentation building:#

pip install "row2vec[docs]"

For data format support:#

pip install pyarrow  # For Parquet files
pip install openpyxl  # For Excel files

Verify Installation#

# Import complete suppression first
exec(open('suppress_minimal.py').read())

import row2vec
print(f"Row2Vec version: {row2vec.__version__}")

# Check available modes
from row2vec import learn_embedding
print("\nAvailable embedding modes:")
print("- unsupervised (neural network autoencoder)")
print("- target (supervised categorical embeddings)")
print("- pca (Principal Component Analysis)")
print("- tsne (t-Distributed Stochastic Neighbor Embedding)")
print("- umap (Uniform Manifold Approximation and Projection)")
✓ Enhanced minimal suppression active
Row2Vec version: 0.1.0

Available embedding modes:
- unsupervised (neural network autoencoder)
- target (supervised categorical embeddings)
- pca (Principal Component Analysis)
- tsne (t-Distributed Stochastic Neighbor Embedding)
- umap (Uniform Manifold Approximation and Projection)

Platform-Specific Notes#

macOS with Apple Silicon#

For optimal performance on M1/M2 Macs:

# Install TensorFlow for Apple Silicon
pip install tensorflow-macos
pip install tensorflow-metal

Linux#

Ensure you have Python development headers:

# Ubuntu/Debian
sudo apt-get install python3-dev

# Fedora/RHEL
sudo dnf install python3-devel

Windows#

We recommend using Anaconda or Miniconda:

conda create -n row2vec python=3.10
conda activate row2vec
pip install row2vec

Troubleshooting#

Import Error: No module named ‘tensorflow’#

TensorFlow might not be installed properly:

pip uninstall tensorflow
pip install --upgrade tensorflow

Memory Issues#

For large datasets, ensure you have sufficient RAM or use sampling:

from row2vec import learn_embedding

# Sample large datasets
embeddings = learn_embedding(
    large_df.sample(n=10000),  # Sample 10k rows
    mode="unsupervised"
)

GPU Support#

To use GPU acceleration:

# Check GPU availability
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

# Install CUDA-enabled TensorFlow
pip install tensorflow[and-cuda]

Next Steps#

Installation complete! Now proceed to the Quickstart guide to learn the basics.