---
title: Solr Normalization Demo
emoji: 🔥
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
short_description: Text normalization demo for Impresso project
---

# Solr Normalization Demo

This Space demonstrates how text is normalized in the **Impresso** project by replicating Solr's text processing functionality.

The pipeline passes text through a chain of analyzers, including tokenization, stopword removal, and language-specific transformations, to prepare it for search and analysis.
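
To make the analyzer chain concrete, here is a minimal Python sketch of the same kind of processing. It is an illustration only, not the code behind this Space or Solr's own implementation: the function names, the step order, and the tiny `STOPWORDS` lists are all assumptions chosen for the example.

```python
import re
import unicodedata

# Tiny illustrative stopword lists (assumption: NOT the lists used by Impresso or Solr).
STOPWORDS = {
    "en": {"the", "a", "of", "and", "in"},
    "fr": {"le", "la", "les", "de", "et"},
    "de": {"der", "die", "das", "und", "in"},
}


def tokenize(text):
    """Split text into runs of word characters (Unicode-aware)."""
    # Compose characters first so accented letters stay inside one token.
    composed = unicodedata.normalize("NFC", text)
    return re.findall(r"\w+", composed)


def fold_accents(token):
    """Strip combining marks, e.g. 'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(text, lang):
    """Tokenize, lowercase, drop stopwords, then fold accents."""
    tokens = [t.lower() for t in tokenize(text)]
    stop = STOPWORDS.get(lang, set())  # unknown language: no stopwords removed
    kept = [t for t in tokens if t not in stop]
    return [fold_accents(t) for t in kept]


print(normalize("Le Café de la Gare", "fr"))  # ['cafe', 'gare']
```

Stopwords are removed before accent folding here so that the lists can be written with their natural spelling; a real Solr field type defines its own filter order in the schema.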

## Features

- Multi-language support (German, French, Spanish, Italian, Portuguese, Dutch, English)
- Automatic language detection
- Detailed analyzer pipeline visualization
- Stopword detection and removal
- Token normalization

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference