Prepare Reddit Scraper for Hugging Face deployment
Files changed:
- .env.template +8 -6
- .gitignore +15 -23
- .streamlit/secrets.toml.template +12 -0
- README-HF.md +69 -0
- advanced_scraper_ui.py +41 -4
- app.py +38 -0
- enhanced_scraper.py +38 -18
- packages.txt +4 -0
- requirements.txt +1 -0
- setup_for_hf.sh +107 -0
.env.template
CHANGED
```diff
@@ -1,10 +1,12 @@
 # Reddit API Credentials
+# Replace these values with your own credentials from https://www.reddit.com/prefs/apps
+# Do not include quotes around the values
+
+# Your Reddit API client ID
 REDDIT_CLIENT_ID=your_client_id_here
+
+# Your Reddit API client secret
 REDDIT_CLIENT_SECRET=your_client_secret_here
-REDDIT_USER_AGENT=your_user_agent_here
-REDDIT_USERNAME=your_username_here
-REDDIT_PASSWORD=your_password_here
 
-#
-
-CLUSTERING_THRESHOLD=0.3
+
+# Your Reddit API user agent (convention: <platform>:<app ID>:<version> by /u/<reddit username>)
+REDDIT_USER_AGENT=RedditScraperApp/1.0
```
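For context on how the app consumes this template: python-dotenv (added to requirements.txt in this commit) loads a copied-and-filled .env into the process environment at startup. A minimal sketch, mirroring the lookups the commit adds elsewhere:

```python
# Copy .env.template to .env and fill in the values first.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env; by default it does not override variables already set

client_id = os.environ.get("REDDIT_CLIENT_ID", "")
client_secret = os.environ.get("REDDIT_CLIENT_SECRET", "")
user_agent = os.environ.get("REDDIT_USER_AGENT", "RedditScraperApp/1.0")
```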
.gitignore
CHANGED
```diff
@@ -1,9 +1,14 @@
-#
+# Environment variables and credentials
+.env
+.streamlit/secrets.toml
+
+# Python cache files
 __pycache__/
 *.py[cod]
 *$py.class
 *.so
 .Python
+env/
 build/
 develop-eggs/
 dist/
@@ -15,35 +20,22 @@ lib64/
 parts/
 sdist/
 var/
-wheels/
 *.egg-info/
 .installed.cfg
 *.egg
 
-# Virtual
+# Virtual environments
 venv/
-env/
 ENV/
+env/
 
-#
-
-
-
-
-# IDE
-.idea/
-.vscode/
-*.swp
-*.swo
-
-# Logs
-*.log
-logs/
+# Data files that might be generated by the app
+*.csv
+*.json
+csv/
+results/
 
-#
+# System files
 .DS_Store
 Thumbs.db
-
-# Data directories
-csv/
-results/
+.ipynb_checkpoints
```
.streamlit/secrets.toml.template
ADDED
```diff
@@ -0,0 +1,12 @@
+# Reddit API Credentials for Hugging Face Space
+# Copy this file to secrets.toml and fill in your credentials
+# Or set these values in the Hugging Face Space settings under "Repository Secrets"
+
+# Your Reddit API client ID
+REDDIT_CLIENT_ID = ""
+
+# Your Reddit API client secret
+REDDIT_CLIENT_SECRET = ""
+
+# Your Reddit API user agent
+REDDIT_USER_AGENT = "RedditScraperApp/1.0"
```
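On Hugging Face Spaces, values from this file (or from the Space's Repository Secrets) surface through `st.secrets`. A minimal sketch of a lookup with an environment-variable fallback; the helper name is illustrative, not part of the commit:

```python
import os
import streamlit as st

def get_credential(name: str, default: str = "") -> str:
    """Prefer st.secrets (secrets.toml / Space Repository Secrets), else the environment."""
    try:
        if name in st.secrets:
            return st.secrets[name]
    except FileNotFoundError:
        pass  # no secrets.toml present and no Space secrets configured
    return os.environ.get(name, default)

client_id = get_credential("REDDIT_CLIENT_ID")
```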
README-HF.md
ADDED
```diff
@@ -0,0 +1,69 @@
+# Reddit Scraper
+
+
+
+A comprehensive tool for scraping Reddit data with a user-friendly interface for data collection, analysis, and visualization.
+
+## Features
+
+- **Search multiple subreddits** simultaneously
+- **Filter posts by keywords** and various criteria
+- **Visualize data** with interactive charts
+- **Export results** to CSV or JSON
+- **Track search history**
+- **Secure credentials** management
+
+## How to Use
+
+### 1. Set up Reddit API Credentials
+
+To use this app, you will need Reddit API credentials. You can get these from the [Reddit Developer Portal](https://www.reddit.com/prefs/apps).
+
+- Click "Create App" or "Create Another App"
+- Fill in the details (name, description)
+- Select "script" as the application type
+- Use "http://localhost:8000" as the redirect URI (this doesn't need to be a real endpoint)
+- Click "Create app"
+- Take note of the client ID (the string under "personal use script") and client secret
+
+Enter these credentials in the app's sidebar or set them up as secrets in the Hugging Face Space settings (if you've duplicated this Space).
+
+### 2. Searching Reddit
+
+1. Enter one or more subreddits to search (one per line)
+2. Specify keywords to search for (one per line)
+3. Adjust parameters like post limit, sorting method, etc.
+4. Click "Run Search" to start scraping
+
+### 3. Working with Results
+
+- Use the tabs to navigate between different views
+- Apply additional filters to the search results
+- Visualize the data with built-in charts
+- Export results to CSV or JSON for further analysis
+
+## Privacy & API Usage
+
+This tool uses the official Reddit API and follows Reddit's API terms of service. Your API credentials are never stored on our servers unless you explicitly save them to your own copy of this Space.
+
+## Set Up Your Own Copy
+
+If you want to run this app with your own credentials always available:
+
+1. Duplicate this Space to your account
+2. Go to Settings → Repository Secrets
+3. Add the following secrets:
+   - `REDDIT_CLIENT_ID`: Your Reddit API client ID
+   - `REDDIT_CLIENT_SECRET`: Your Reddit API client secret
+   - `REDDIT_USER_AGENT`: (Optional) A custom user agent string
+
+## Tech Stack
+
+- [Streamlit](https://streamlit.io/): UI framework
+- [PRAW](https://praw.readthedocs.io/): Reddit API wrapper
+- [Pandas](https://pandas.pydata.org/): Data processing
+- [Plotly](https://plotly.com/): Data visualization
+
+## Feedback & Contributions
+
+If you find any issues or have suggestions for improvements, please open an issue on the [GitHub repository](https://github.com/yourusername/reddit-scraper) or create a discussion on this Hugging Face Space.
```
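To make step 1 concrete, here is a minimal sketch of authenticating with the resulting credentials through PRAW (the wrapper listed under Tech Stack); the subreddit and limit are arbitrary examples:

```python
import praw

reddit = praw.Reddit(
    client_id="your_client_id_here",        # the string under "personal use script"
    client_secret="your_client_secret_here",
    user_agent="RedditScraperApp/1.0",
)

# Read-only access is sufficient for scraping public posts.
for submission in reddit.subreddit("python").hot(limit=5):
    print(submission.title, submission.score)
```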
advanced_scraper_ui.py
CHANGED
```diff
@@ -6,6 +6,7 @@ import time
 import os
 import json
 from datetime import datetime
+from dotenv import load_dotenv
 from enhanced_scraper import EnhancedRedditScraper
 
 # Page configuration
@@ -209,13 +210,49 @@ def main():
 
     # Credentials
     with st.expander("Reddit API Credentials", expanded=not st.session_state.scraper):
-
-
-
+        st.markdown("""
+        ### Reddit API Credentials
+        Please enter your Reddit API credentials below. You can obtain these from the
+        [Reddit Developer Portal](https://www.reddit.com/prefs/apps).
+
+        If you don't have your own credentials, you can leave these fields empty and the app
+        will try to use credentials from environment variables if available.
+        """)
+
+        # Try to load from .env file
+        load_dotenv()
+        default_client_id = os.environ.get("REDDIT_CLIENT_ID", "")
+        default_client_secret = os.environ.get("REDDIT_CLIENT_SECRET", "")
+        default_user_agent = os.environ.get("REDDIT_USER_AGENT", "RedditScraperApp/1.0")
+
+        client_id = st.text_input("Client ID", value=default_client_id)
+        client_secret = st.text_input("Client Secret", value=default_client_secret, type="password")
+        user_agent = st.text_input("User Agent", value=default_user_agent)
+
+        save_as_env = st.checkbox("Save credentials for future use (saved in .env file)", value=False)
 
     if st.button("Initialize API Connection"):
+        # Save credentials if requested
+        if save_as_env and (client_id or client_secret):
+            env_vars = []
+            if client_id:
+                env_vars.append(f"REDDIT_CLIENT_ID={client_id}")
+            if client_secret:
+                env_vars.append(f"REDDIT_CLIENT_SECRET={client_secret}")
+            if user_agent and user_agent != "RedditScraperApp/1.0":
+                env_vars.append(f"REDDIT_USER_AGENT={user_agent}")
+
+            # Write to .env file
+            with open(".env", "w") as f:
+                f.write("\n".join(env_vars))
+            st.success("Credentials saved to .env file")
+
        if initialize_scraper(client_id, client_secret, user_agent):
             st.success("API connection established!")
+            # Set environment variables for the current session
+            os.environ["REDDIT_CLIENT_ID"] = client_id
+            os.environ["REDDIT_CLIENT_SECRET"] = client_secret
+            os.environ["REDDIT_USER_AGENT"] = user_agent
 
     # Search Parameters
     st.subheader("Search Parameters")
@@ -476,4 +513,4 @@ def main():
         st.info("No search history yet.")
 
 if __name__ == "__main__":
-    main()
+    main()
```
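One caveat in the save path above: `open(".env", "w")` truncates the file, so any keys not re-entered in the form are dropped. If preserving unrelated entries matters, python-dotenv's `set_key` is a possible alternative; a sketch (the helper name is illustrative, not part of the commit):

```python
from dotenv import set_key

def save_credentials(client_id: str, client_secret: str, user_agent: str) -> None:
    # set_key rewrites one entry at a time and leaves other .env lines intact.
    if client_id:
        set_key(".env", "REDDIT_CLIENT_ID", client_id)
    if client_secret:
        set_key(".env", "REDDIT_CLIENT_SECRET", client_secret)
    if user_agent:
        set_key(".env", "REDDIT_USER_AGENT", user_agent)
```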
app.py
ADDED
```diff
@@ -0,0 +1,38 @@
+# Reddit Scraper Hugging Face Space Launcher
+# This file serves as the entry point for our Hugging Face Space
+
+import os
+import streamlit as st
+from dotenv import load_dotenv
+
+# Load environment variables from .streamlit/secrets.toml if in a Hugging Face Space environment
+def load_huggingface_secrets():
+    try:
+        # HF Spaces store secrets in st.secrets
+        client_id = st.secrets.get("REDDIT_CLIENT_ID", "")
+        client_secret = st.secrets.get("REDDIT_CLIENT_SECRET", "")
+        user_agent = st.secrets.get("REDDIT_USER_AGENT", "RedditScraperApp/1.0")
+
+        # Set as environment variables for other modules to use
+        if client_id:
+            os.environ["REDDIT_CLIENT_ID"] = client_id
+        if client_secret:
+            os.environ["REDDIT_CLIENT_SECRET"] = client_secret
+        if user_agent:
+            os.environ["REDDIT_USER_AGENT"] = user_agent
+
+        return True
+    except Exception:
+        # Fall back to a regular .env file if not in an HF Space
+        return False
+
+# Try to load secrets (first from HF secrets, then from .env)
+load_huggingface_secrets()
+load_dotenv()
+
+# Import the main app after environment setup to ensure it has access to the variables
+from advanced_scraper_ui import main
+
+# Run the main app
+if __name__ == "__main__":
+    main()
```
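A note on the guard at the bottom: Streamlit executes the target script with `__name__` set to `"__main__"`, so `main()` still runs under `streamlit run app.py`, which is effectively how a Streamlit-SDK Space launches the app. Deferring the `advanced_scraper_ui` import until after the secrets are loaded is what ensures the UI module sees the populated environment variables.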
enhanced_scraper.py
CHANGED
```diff
@@ -4,7 +4,9 @@ import datetime
 import re
 import json
 import os
+import os.path
 from typing import List, Dict, Any, Optional
+from dotenv import load_dotenv
 
 class EnhancedRedditScraper:
     """
@@ -194,26 +196,44 @@ class EnhancedRedditScraper:
 
 # Example usage
 if __name__ == "__main__":
+    # Load environment variables from .env file
+    load_dotenv()
+
+    # Get credentials from environment variables or use defaults for development
+    client_id = os.environ.get("REDDIT_CLIENT_ID", "")
+    client_secret = os.environ.get("REDDIT_CLIENT_SECRET", "")
+    user_agent = os.environ.get("REDDIT_USER_AGENT", "RedditScraperApp/1.0")
+
+    if not client_id or not client_secret:
+        print("Warning: Reddit API credentials not found in environment variables.")
+        print("Please set REDDIT_CLIENT_ID and REDDIT_CLIENT_SECRET in .env file")
+        print("or as environment variables for proper functionality.")
+        # For development only, you could set default credentials here
+
     # Create the scraper instance
     scraper = EnhancedRedditScraper(
-        client_id=
-        client_secret=
-        user_agent=
+        client_id=client_id,
+        client_secret=client_secret,
+        user_agent=user_agent
     )
 
     # Simple example
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+    try:
+        results = scraper.scrape_subreddit(
+            subreddit_name="cuny",
+            keywords=["question", "help", "confused"],
+            limit=25,
+            sort_by="hot",
+            include_comments=True
+        )
+
+        print(f"Found {len(results)} matching posts")
+
+        # Save results to file
+        if results:
+            csv_path = scraper.save_results_to_csv("reddit_results")
+            json_path = scraper.save_results_to_json("reddit_results")
+            print(f"Results saved to {csv_path} and {json_path}")
+    except Exception as e:
+        print(f"Error: {str(e)}")
+        print("This may be due to missing or invalid API credentials.")
```
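With the example above in place, the module can be exercised directly via `python enhanced_scraper.py` once `.env` is populated. For other scripts, a small factory helper (hypothetical, not part of the commit) avoids repeating the boilerplate:

```python
import os
from dotenv import load_dotenv
from enhanced_scraper import EnhancedRedditScraper

def scraper_from_env() -> EnhancedRedditScraper:
    """Build a scraper from the same variables the example usage reads."""
    load_dotenv()
    return EnhancedRedditScraper(
        client_id=os.environ["REDDIT_CLIENT_ID"],          # KeyError if unset
        client_secret=os.environ["REDDIT_CLIENT_SECRET"],
        user_agent=os.environ.get("REDDIT_USER_AGENT", "RedditScraperApp/1.0"),
    )
```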
packages.txt
ADDED
```diff
@@ -0,0 +1,4 @@
+# System dependencies for the Reddit Scraper app
+# This file is used by Hugging Face Spaces to install system packages
+
+# No additional system packages are required for this app
```
requirements.txt
CHANGED
```diff
@@ -3,3 +3,4 @@ pandas>=1.3.0
 streamlit>=1.3.0
 plotly>=5.5.0
 matplotlib>=3.5.0
+python-dotenv>=0.20.0
```
setup_for_hf.sh
ADDED
```diff
@@ -0,0 +1,107 @@
+#!/bin/bash
+
+# Setup script for pushing Reddit Scraper to Hugging Face
+
+echo "==== Reddit Scraper: Hugging Face Setup ===="
+echo ""
+
+# Check for required tools
+echo "Checking for required tools..."
+
+if ! command -v git &> /dev/null; then
+    echo "❌ Git not found. Please install Git before continuing."
+    exit 1
+else
+    echo "✅ Git installed"
+fi
+
+if ! command -v python3 &> /dev/null; then
+    echo "❌ Python 3 not found. Please install Python 3 before continuing."
+    exit 1
+else
+    echo "✅ Python 3 installed"
+fi
+
+if ! command -v pip3 &> /dev/null; then
+    echo "❌ pip not found. Please install pip before continuing."
+    exit 1
+else
+    echo "✅ pip installed"
+fi
+
+if ! command -v huggingface-cli &> /dev/null; then
+    echo "⚠️ Hugging Face CLI not installed. Installing now..."
+    pip install huggingface_hub
+    if ! command -v huggingface-cli &> /dev/null; then
+        echo "❌ Failed to install Hugging Face CLI. Please install manually: pip install huggingface_hub"
+        exit 1
+    else
+        echo "✅ Hugging Face CLI installed"
+    fi
+else
+    echo "✅ Hugging Face CLI installed"
+fi
+
+echo ""
+echo "Verifying project files..."
+
+# Check for required files
+required_files=("app.py" "requirements.txt" "enhanced_scraper.py" "advanced_scraper_ui.py" "README-HF.md")
+missing_files=0
+
+for file in "${required_files[@]}"; do
+    if [ ! -f "$file" ]; then
+        echo "❌ Missing required file: $file"
+        missing_files=$((missing_files+1))
+    else
+        echo "✅ Found $file"
+    fi
+done
+
+if [ $missing_files -gt 0 ]; then
+    echo ""
+    echo "❌ Some required files are missing. Please make sure all project files exist."
+    exit 1
+fi
+
+echo ""
+echo "All required files are present."
+echo ""
+
+# Check for Hugging Face login
+echo "Checking Hugging Face login status..."
+huggingface-cli whoami &> /dev/null
+if [ $? -ne 0 ]; then
+    echo "You need to log in to Hugging Face first."
+    echo "Run the following command and follow the instructions:"
+    echo ""
+    echo "huggingface-cli login"
+    echo ""
+    exit 1
+else
+    echo "✅ Already logged in to Hugging Face"
+fi
+
+echo ""
+echo "==== Ready to push to Hugging Face! ===="
+echo ""
+echo "To create a new Hugging Face Space and push your code:"
+echo ""
+echo "1. Go to https://huggingface.co/new-space"
+echo "2. Choose a Space name (e.g., 'reddit-scraper')"
+echo "3. Select 'Streamlit' as the SDK"
+echo "4. Create the Space"
+echo ""
+echo "Then run the following commands to push your code:"
+echo ""
+echo "git init"
+echo "git add ."
+echo "git commit -m \"Initial commit of Reddit Scraper\""
+echo "git branch -M main"
+echo "git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/reddit-scraper"
+echo "git push -u origin main"
+echo ""
+echo "Replace YOUR_USERNAME with your Hugging Face username."
+echo ""
+echo "Remember to set up your Reddit API credentials in the Space settings!"
+echo ""
```
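Run the script from the project root with `bash setup_for_hf.sh` (or `chmod +x setup_for_hf.sh && ./setup_for_hf.sh`). It checks prerequisites, installing the Hugging Face CLI via pip if missing, and prints the push commands; it does not create the Space or push the code for you.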