feat(openinsider): add OpenInsider insider-trading crawler with multi-symbol support and daily scheduling
- Add app/crawlers/openinsider.py; source: http://openinsider.com/search?q={symbol}
- Multi-symbol support: track several tickers at once via SYMBOLS=PLTR,NVDA,... (or a single one via SYMBOL)
- runner: multi-instance scheduling and startup; /check triggers every crawler in turn
- API: /info, /stats, /check, and /notify_test support multi-crawler responses
- config/base: new RUN_DAILY_AT for a fixed daily time; falls back to CHECK_INTERVAL when unset
- notifications: new send_custom_email, send_text_webhook, send_text_discord
- README and .env.template updated; .env switched to CRAWLER_TYPE=openinsider
- Remove the quiver_insiders crawler and its settings

BREAKING CHANGE: CRAWLER_TYPE=quiver_insiders is no longer supported; use openinsider instead.
.env.template
@@ -1,5 +1,6 @@
 # Basic settings
 CHECK_INTERVAL=300
+RUN_DAILY_AT=12:00
 LOG_LEVEL=INFO
 ALWAYS_NOTIFY_ON_STARTUP=false
 
@@ -23,3 +24,11 @@ DISCORD_WEBHOOK=https://discord.com/api/webhooks/YOUR/DISCORD/WEBHOOK
 # By default Docker uses /app/data and /app/logs; local runs use ./data and ./logs
 # DATA_DIR=./data
 # LOG_DIR=./logs
+
+# Choose the crawler type and its parameters
+# Options: barrons | openinsider
+CRAWLER_TYPE=openinsider
+# Stock symbol for the insider-trading crawler (single)
+SYMBOL=PLTR
+# Or track several at once, comma-separated
+# SYMBOLS=PLTR,NVDA,TSLA
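
For reference, a minimal standalone sketch of how these two variables are resolved; it mirrors the parsing added in the runner below (SYMBOLS takes precedence over SYMBOL, and entries are trimmed and upper-cased):

    import os

    # SYMBOLS (comma-separated) wins; otherwise fall back to SYMBOL, default PLTR.
    symbols_raw = os.getenv('SYMBOLS') or os.getenv('SYMBOL', 'PLTR')
    symbols = [s.strip().upper() for s in symbols_raw.split(',') if s.strip()]
    print(symbols)  # SYMBOLS="pltr, nvda," -> ['PLTR', 'NVDA']
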
README.md
@@ -1,14 +1,16 @@
-# Barron's Stock-Pick Crawler (Modular Architecture)
+# Stock Crawler Service (Modular Architecture)
 
-An extensible crawler service with a built-in HTTP API and multiple notification channels (Email/Webhook/Discord).
-Now modular: the API and crawler core are separated, making it easy to add crawlers for other sites later.
+An extensible stock crawler service with a built-in HTTP API and multiple notification channels (Email/Webhook/Discord).
+Two kinds of crawlers are currently provided:
+- Barron's stock picks
+- OpenInsider insider trading (multi-symbol support)
 
 ## Features
-- Scheduled scraping of the Barron's stock-picks page
+- Scheduled scraping (every N seconds, or at a fixed daily time)
 - Sends notifications only when there is new content (optionally also on first startup)
-- Built-in `/health`, `/stats`, `/check`, and `/notify_test` APIs
+- Built-in `/health`, `/info`, `/stats`, `/check`, and `/notify_test` APIs
 - Dockerized deployment with persistent data and logs
-- Modular architecture, easy to extend to other sites
+- Modular architecture that can be extended to other sites later
 
 ## Project Structure
 ```
@@ -18,6 +20,7 @@ app/
   api/server.py # Flask API
   crawlers/base.py # BaseCrawler: shared scheduling/diffing/notification
   crawlers/barrons.py # Barron's crawler
+  crawlers/openinsider.py # OpenInsider insider-trading crawler (multi-symbol)
   crawlers/template.py # Template for new sites (copy, rename, extend)
   services/storage.py # JSON storage
   services/notifications.py # Email/Webhook/Discord
@@ -60,7 +63,8 @@ python enhanced_crawler.py
 
 ## Environment Variables
 - Basics
-  - `CHECK_INTERVAL`: check interval in seconds, default 300
+  - `CHECK_INTERVAL`: check interval in seconds, default 300 (ignored when `RUN_DAILY_AT` is set)
+  - `RUN_DAILY_AT`: fixed daily time (e.g. `12:00`), using the container's local timezone
   - `LOG_LEVEL`: log level, default `INFO` (`DEBUG` available)
   - `ALWAYS_NOTIFY_ON_STARTUP`: whether to send the current list right after startup (true/false), default false
 - Email (optional)
@@ -75,14 +79,22 @@ python enhanced_crawler.py
   - `DATA_DIR`: data output path (Docker default `/app/data`; local default `./data`)
   - `LOG_DIR`: log output path (Docker default `/app/logs`; local default `./logs`)
 
+- Crawler selection and parameters
+  - `CRAWLER_TYPE`: `barrons` | `openinsider`
+  - Barron's: no extra parameters
+  - OpenInsider:
+    - single symbol: `SYMBOL=PLTR`
+    - multiple symbols: `SYMBOLS=PLTR,NVDA,TSLA`
+
 Email tips:
 - For Gmail, use an App Password and enable two-step verification
 - For school/corporate mailboxes, confirm the SMTP host, port, and encryption with your administrator
 
 ## Web API Endpoints
 - `GET /health`: health check
-- `GET /stats`: current statistics (start time, check count, error count…)
-- `GET /check`: run a check immediately
+- `GET /info`: current crawler info (returns an array when multiple instances run)
+- `GET /stats`: current statistics (an object for a single instance, a map for multiple)
+- `GET /check`: run a check immediately (runs every crawler in multi-instance mode)
 - `GET /notify_test?channel=email|webhook|discord`: send a test notification
 
 ## Health Checks and Operations
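
A quick way to exercise these endpoints once the service is up, as a sketch: it assumes the service listens on localhost:8080, the port the runner binds (see app.run(..., port=8080) below), and the optional target parameter of /notify_test comes from the server code further down:

    import requests

    BASE = "http://localhost:8080"  # port the runner binds via app.run(..., port=8080)

    print(requests.get(f"{BASE}/info").json())   # list of crawlers in multi-instance mode
    print(requests.get(f"{BASE}/stats").json())  # object (single) or map (multi)
    print(requests.get(f"{BASE}/check").json())  # triggers a check on every crawler
    # channel as documented; 'target' narrows a multi-crawler test to one symbol
    print(requests.get(f"{BASE}/notify_test",
                       params={"channel": "discord", "target": "PLTR"}).json())
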
@@ -102,6 +114,7 @@ docker-compose down
 - `last_update`: ISO timestamp
 - `stock_picks`: list of articles (title/link/hash/scraped_at)
 - `stats`: run statistics
+- OpenInsider multi-symbol: `data/openinsider_<SYMBOL>.json`
 
 ## Adding a New Site (Suggested Flow)
 1) Copy the template: `app/crawlers/template.py` → `app/crawlers/<your_site>.py`
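
A small sketch of reading one of these per-symbol files, assuming they share the schema listed above (local default `DATA_DIR=./data`):

    import json
    from pathlib import Path

    doc = json.loads(Path("data/openinsider_PLTR.json").read_text(encoding="utf-8"))
    print(doc["last_update"])            # ISO timestamp of the last update
    for pick in doc["stock_picks"][:5]:  # entries carry title/link/hash/scraped_at
        print(pick["hash"], pick["title"])
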
@@ -115,7 +128,8 @@ docker-compose down
 ## Troubleshooting
 - Page cannot be fetched: check the network, the User-Agent, and whether the target site changed
 - Email fails: verify the SMTP settings, app password, port, and encryption
-- Nothing parsed: check the logs and update the selector logic
+- Barron's parses nothing: check the logs and update the selector logic
+- OpenInsider parses nothing: verify `SYMBOL`/`SYMBOLS` and watch for rate limiting by the site
 - Service unresponsive: check the container logs and health-check status
 
 ## Security Recommendations
@@ -124,5 +138,6 @@ docker-compose down
 - If the API is exposed publicly, add authentication and HTTPS
 
 ## Release Notes
-- 2025-09: Refactored into a modular architecture, separated API from crawler logic, added an extension template
+- 2025-09:
+  - Refactored into a modular architecture; API and crawler logic separated
+  - Added the OpenInsider insider-trading crawler with multi-symbol support
app/api/server.py
@@ -8,6 +8,10 @@ from app.services import notifications as notif
 
 def create_app(crawler) -> Flask:
     app = Flask(__name__)
+    # Support single crawler or a list of crawlers
+    crawlers = None
+    if isinstance(crawler, (list, tuple)):
+        crawlers = list(crawler)
 
     @app.get('/health')
     def health():
@@ -15,12 +19,47 @@ def create_app(crawler) -> Flask:
 
     @app.get('/stats')
     def stats():
+        if crawlers is not None:
+            return jsonify({
+                (getattr(c, 'symbol', getattr(c, 'name', f"crawler_{i}")) or f"crawler_{i}"):
+                    c.stats for i, c in enumerate(crawlers)
+            })
         if crawler:
             return jsonify(crawler.stats)
         return jsonify({"error": "Crawler not initialized"}), 500
 
+    @app.get('/info')
+    def info():
+        if crawlers is not None:
+            out = []
+            for c in crawlers:
+                out.append({
+                    "name": getattr(c, 'name', 'unknown'),
+                    "type": c.__class__.__name__,
+                    "symbol": getattr(c, 'symbol', None),
+                    "schedule": getattr(c.config, 'run_daily_at', None) or f"every {c.config.check_interval}s",
+                })
+            return jsonify(out)
+        if not crawler:
+            return jsonify({"error": "Crawler not initialized"}), 500
+        return jsonify({
+            "name": getattr(crawler, 'name', 'unknown'),
+            "type": crawler.__class__.__name__,
+            "symbol": getattr(crawler, 'symbol', None),
+            "schedule": getattr(crawler.config, 'run_daily_at', None) or f"every {crawler.config.check_interval}s",
+        })
+
     @app.get('/check')
     def manual_check():
+        if crawlers is not None:
+            results = []
+            for c in crawlers:
+                r = c.run_check() or []
+                results.append({
+                    "symbol": getattr(c, 'symbol', None),
+                    "new": len(r)
+                })
+            return jsonify({"results": results})
         if not crawler:
             return jsonify({"error": "Crawler not initialized"}), 500
         result = crawler.run_check() or []
@@ -28,29 +67,49 @@ def create_app(crawler) -> Flask:
 
     @app.get('/notify_test')
     def notify_test():
+        channel = (request.args.get('channel') or 'email').lower()
+        target = request.args.get('target')
+        test_pick = [notif.build_test_pick()]
+
+        def _send_for(c):
+            if channel == 'email':
+                if not c.config.email:
+                    return {"error": "Email config not set"}
+                notif.send_email(test_pick, c.config.email)
+            elif channel == 'webhook':
+                if not c.config.webhook_url:
+                    return {"error": "Webhook URL not set"}
+                notif.send_webhook(test_pick, c.config.webhook_url)
+            elif channel == 'discord':
+                if not c.config.discord_webhook:
+                    return {"error": "Discord webhook not set"}
+                notif.send_discord(test_pick, c.config.discord_webhook)
+            else:
+                return {"error": f"Unsupported channel: {channel}"}
+            return {"result": f"Test notification sent via {channel}"}
+
+        if crawlers is not None:
+            results = {}
+            for c in crawlers:
+                key = getattr(c, 'symbol', getattr(c, 'name', 'unknown'))
+                if target and key != target:
+                    continue
+                try:
+                    results[key] = _send_for(c)
+                except Exception as e:
+                    c.logger.error(f"Test notification failed ({key}): {e}")
+                    results[key] = {"error": str(e)}
+            return jsonify(results)
+
         if not crawler:
             return jsonify({"error": "Crawler not initialized"}), 500
-        channel = (request.args.get('channel') or 'email').lower()
-        test_pick = [notif.build_test_pick()]
         try:
-            if channel == 'email':
-                if not crawler.config.email:
-                    return jsonify({"error": "Email config not set"}), 400
-                notif.send_email(test_pick, crawler.config.email)
-            elif channel == 'webhook':
-                if not crawler.config.webhook_url:
-                    return jsonify({"error": "Webhook URL not set"}), 400
-                notif.send_webhook(test_pick, crawler.config.webhook_url)
-            elif channel == 'discord':
-                if not crawler.config.discord_webhook:
-                    return jsonify({"error": "Discord webhook not set"}), 400
-                notif.send_discord(test_pick, crawler.config.discord_webhook)
-            else:
-                return jsonify({"error": f"Unsupported channel: {channel}"}), 400
-            return jsonify({"result": f"Test notification sent via {channel}"})
+            res = _send_for(crawler)
+            if 'error' in res:
+                return jsonify(res), 400
+            return jsonify(res)
         except Exception as e:
             crawler.logger.error(f"Test notification failed: {e}")
             return jsonify({"error": str(e)}), 500
 
     return app
app/config.py
@@ -24,6 +24,7 @@ class AppConfig:
     data_dir: str
     log_dir: str
     email: EmailConfig | None
+    run_daily_at: str | None
 
 
 def _resolve_dir(env_key: str, default_subdir: str) -> str:
@@ -82,6 +83,7 @@ def load_config() -> AppConfig:
     discord_webhook = os.getenv('DISCORD_WEBHOOK')
     data_dir = _resolve_dir('DATA_DIR', 'data')
     log_dir = _resolve_dir('LOG_DIR', 'logs')
+    run_daily_at = os.getenv('RUN_DAILY_AT')  # e.g., "12:00"
 
     return AppConfig(
         check_interval=check_interval,
@@ -92,5 +94,5 @@ def load_config() -> AppConfig:
         data_dir=data_dir,
         log_dir=log_dir,
         email=load_email_config(),
+        run_daily_at=run_daily_at,
     )
-
app/crawlers/base.py
@@ -125,12 +125,15 @@ class BaseCrawler(ABC):
         signal.signal(signal.SIGINT, self._signal_handler)
         signal.signal(signal.SIGTERM, self._signal_handler)
 
-        schedule.every(self.config.check_interval).seconds.do(self.run_check)
-        self.logger.info(f"🚀 Crawler started; checking every {self.config.check_interval} seconds")
+        if getattr(self.config, 'run_daily_at', None):
+            schedule.every().day.at(self.config.run_daily_at).do(self.run_check)
+            self.logger.info(f"🚀 Crawler started; checking daily at {self.config.run_daily_at}")
+        else:
+            schedule.every(self.config.check_interval).seconds.do(self.run_check)
+            self.logger.info(f"🚀 Crawler started; checking every {self.config.check_interval} seconds")
        self.run_check()
        self._first_check_done = True
        while self.running:
            schedule.run_pending()
            time.sleep(1)
        self.logger.info("Crawler stopped")
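
The branch above is plain `schedule` usage; here is a minimal standalone demo of the same choice (daily wall-clock time vs. fixed interval), useful for sanity-checking RUN_DAILY_AT values locally:

    import time
    import schedule

    def job():
        print("check")

    run_daily_at = "12:00"  # as from RUN_DAILY_AT; set to None to use the interval path
    if run_daily_at:
        schedule.every().day.at(run_daily_at).do(job)  # once a day, local time
    else:
        schedule.every(300).seconds.do(job)            # every CHECK_INTERVAL seconds

    while True:
        schedule.run_pending()
        time.sleep(1)
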
app/crawlers/openinsider.py (new file, 162 lines)
@@ -0,0 +1,162 @@
+from __future__ import annotations
+
+import hashlib
+from datetime import datetime
+from typing import List, Dict, Optional
+
+import requests
+from bs4 import BeautifulSoup
+
+from app.crawlers.base import BaseCrawler
+from app.services import notifications as notif
+
+
+class OpenInsiderCrawler(BaseCrawler):
+    """Crawler for OpenInsider search results.
+
+    Source: http://openinsider.com/search?q={symbol}
+    Parses the HTML table and emits insider transactions.
+    """
+
+    def __init__(self, config, logger, symbol: str = "PLTR"):
+        super().__init__(
+            name=f"OpenInsider insider trades: {symbol}",
+            config=config,
+            logger=logger,
+            data_filename=f"openinsider_{symbol}.json",
+        )
+        self.symbol = symbol.upper()
+        self.url = f"http://openinsider.com/search?q={self.symbol}"
+        self.headers = {
+            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
+                          'AppleWebKit/537.36 (KHTML, like Gecko) '
+                          'Chrome/114.0 Safari/537.36'
+        }
+
+    def fetch_page(self) -> Optional[str]:
+        try:
+            resp = requests.get(self.url, headers=self.headers, timeout=30)
+            resp.raise_for_status()
+            return resp.text
+        except requests.RequestException as e:
+            self.logger.error(f"Failed to fetch OpenInsider page: {e}")
+            self.stats['errors'] += 1
+            return None
+
+    def parse_items(self, html_content: str) -> List[Dict]:
+        soup = BeautifulSoup(html_content, 'html.parser')
+
+        # Find the main results table by looking for expected headers
+        best_table = None
+        candidate_tables = soup.find_all('table')
+        self.logger.info(f"OpenInsider: found {len(candidate_tables)} <table> elements")
+        expected_headers = {'insider', 'insider name', 'ticker', 'trans type', 'transaction', 'trade date', 'filing date'}
+        for tbl in candidate_tables:
+            headers = [th.get_text(strip=True).lower() for th in tbl.find_all('th')]
+            if not headers:
+                continue
+            hset = set(headers)
+            if any(h in hset for h in expected_headers):
+                best_table = tbl
+                break
+        if not best_table and candidate_tables:
+            best_table = candidate_tables[0]
+
+        if not best_table:
+            self.logger.warning("OpenInsider: results table not found")
+            return []
+
+        # Build header index map (robust match)
+        header_map: Dict[str, int] = {}
+        header_texts = [th.get_text(strip=True).lower() for th in best_table.find_all('th')]
+        for idx, text in enumerate(header_texts):
+            header_map[text] = idx
+
+        def find_idx(possible: List[str]) -> Optional[int]:
+            for key in possible:
+                if key in header_map:
+                    return header_map[key]
+            # fuzzy contains
+            for k, v in header_map.items():
+                if any(p in k for p in possible):
+                    return v
+            return None
+
+        idx_insider = find_idx(['insider name', 'insider', 'name'])
+        idx_type = find_idx(['trans type', 'transaction', 'type'])
+        idx_qty = find_idx(['qty', 'quantity', 'shares'])
+        idx_price = find_idx(['price'])
+        idx_ticker = find_idx(['ticker'])
+        idx_trade_date = find_idx(['trade date', 'date'])
+        idx_filing_date = find_idx(['filing date', 'filed'])
+
+        rows = best_table.find_all('tr')
+        # Skip header rows (those that contain th)
+        data_rows = [r for r in rows if r.find('td')]
+
+        items: List[Dict] = []
+        for row in data_rows[:100]:
+            cols = row.find_all('td')
+
+            def col_text(i: Optional[int]) -> str:
+                if i is None or i >= len(cols):
+                    return ''
+                return cols[i].get_text(strip=True)
+
+            insider = col_text(idx_insider) or 'Unknown Insider'
+            trans_type = col_text(idx_type) or 'N/A'
+            qty = col_text(idx_qty) or 'N/A'
+            price = col_text(idx_price) or 'N/A'
+            ticker = (col_text(idx_ticker) or '').upper()
+            trade_date = col_text(idx_trade_date)
+            filing_date = col_text(idx_filing_date)
+
+            if ticker and self.symbol not in ticker:
+                # Keep results aligned to symbol query
+                continue
+
+            title = f"{self.symbol} {trans_type} - {insider} qty {qty} @ {price} on {trade_date}"
+            if filing_date:
+                title += f" (filed {filing_date})"
+            hash_src = f"{self.symbol}|{insider}|{trans_type}|{qty}|{price}|{trade_date}|{filing_date}"
+            items.append({
+                'title': title,
+                'link': self.url,
+                'scraped_at': datetime.now().isoformat(),
+                'hash': hashlib.md5(hash_src.encode('utf-8')).hexdigest()[:12],
+            })
+
+        self.logger.info(f"OpenInsider: parsing done, extracted {len(items)} transactions")
+        return items
+
+    def _send_notifications(self, items: List[Dict]) -> None:
+        subject = f"OpenInsider insider-trade update - {self.symbol} ({len(items)} items)"
+        lines = []
+        for it in items[:10]:
+            lines.append(f"• {it['title']}")
+        body = (
+            f"Found {len(items)} new insider-trade updates (OpenInsider):\n\n" + "\n".join(lines) + "\n\n"
+            f"Scraped at: {datetime.now().isoformat()}\nSource: {self.url}"
+        )
+
+        sent = False
+        if self.config.email:
+            try:
+                notif.send_custom_email(subject, body, self.config.email)
+                sent = True
+            except Exception as e:
+                self.logger.error(f"Email notification failed: {e}")
+        if self.config.webhook_url:
+            try:
+                notif.send_text_webhook(subject + "\n\n" + body, self.config.webhook_url)
+                sent = True
+            except Exception as e:
+                self.logger.error(f"Webhook notification failed: {e}")
+        if self.config.discord_webhook:
+            try:
+                notif.send_text_discord(title=subject, description=f"{self.symbol} insider-trade update (OpenInsider)", lines=lines[:10], webhook=self.config.discord_webhook)
+                sent = True
+            except Exception as e:
+                self.logger.error(f"Discord notification failed: {e}")
+        if sent:
+            self.stats['last_notification'] = datetime.now().isoformat()
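
To sanity-check the header-matching heuristic above in isolation, here is a small sketch with a hypothetical fragment shaped like an OpenInsider results table (sample data invented for illustration):

    from bs4 import BeautifulSoup

    html = """
    <table><tr><th>Filing Date</th><th>Trade Date</th><th>Ticker</th>
    <th>Insider Name</th><th>Trans Type</th><th>Qty</th><th>Price</th></tr>
    <tr><td>2025-09-01</td><td>2025-08-30</td><td>PLTR</td>
    <td>Jane Doe</td><td>P - Purchase</td><td>1,000</td><td>$30.00</td></tr></table>
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    headers = [th.get_text(strip=True).lower() for th in table.find_all("th")]
    expected = {"insider", "insider name", "ticker", "trans type",
                "transaction", "trade date", "filing date"}
    print(set(headers) & expected)  # non-empty -> this table would be selected
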
app/runner.py
@@ -1,9 +1,13 @@
 from __future__ import annotations
 
+import os
 import threading
+import time
+import schedule
 
 from app.config import load_config, setup_logging
 from app.crawlers.barrons import BarronsCrawler
+from app.crawlers.openinsider import OpenInsiderCrawler
 from app.api.server import create_app
@@ -12,11 +16,21 @@ def start():
     config = load_config()
     logger = setup_logging(config.log_level, config.log_dir)
 
-    # Create crawler instance
-    crawler = BarronsCrawler(config, logger)
+    # Select crawler via env var
+    crawler_type = (os.getenv('CRAWLER_TYPE') or 'barrons').lower()
+    crawlers = []
+    if crawler_type in ('openinsider', 'open_insider'):
+        symbols_raw = os.getenv('SYMBOLS') or os.getenv('SYMBOL', 'PLTR')
+        symbols = [s.strip().upper() for s in symbols_raw.split(',') if s.strip()]
+        logger.info(f"Using the OpenInsider insider-trading crawler, symbols={symbols}")
+        for sym in symbols:
+            crawlers.append(OpenInsiderCrawler(config, logger, symbol=sym))
+    else:
+        logger.info("Using the Barron's stock-picks crawler")
+        crawlers.append(BarronsCrawler(config, logger))
 
     # Create and start API in background
-    app = create_app(crawler)
+    app = create_app(crawlers if len(crawlers) > 1 else crawlers[0])
 
     def run_api():
         app.run(host='0.0.0.0', port=8080, debug=False)
@@ -24,6 +38,29 @@ def start():
     flask_thread = threading.Thread(target=run_api, daemon=True)
     flask_thread.start()
 
-    # Run crawler loop (blocking)
-    crawler.run()
+    # Schedule checks for each crawler and run loop (blocking)
+    if getattr(config, 'run_daily_at', None):
+        for c in crawlers:
+            schedule.every().day.at(config.run_daily_at).do(c.run_check)
+        logger.info(f"🚀 Multi-crawler started; checking daily at {config.run_daily_at}: {[getattr(c, 'symbol', c.name) for c in crawlers]}")
+    else:
+        for c in crawlers:
+            schedule.every(config.check_interval).seconds.do(c.run_check)
+        logger.info(f"🚀 Multi-crawler started; checking every {config.check_interval} seconds: {[getattr(c, 'symbol', c.name) for c in crawlers]}")
+
+    # Initial run for each
+    for c in crawlers:
+        c.run_check()
+        # Mark first check done to respect ALWAYS_NOTIFY_ON_STARTUP logic afterwards
+        try:
+            c._first_check_done = True
+        except Exception:
+            pass
+
+    # Main loop
+    try:
+        while True:
+            schedule.run_pending()
+            time.sleep(1)
+    except KeyboardInterrupt:
+        logger.info("Stop signal received, shutting down…")
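
A hedged launch sketch: the env-driven selection above can be exercised like this, assuming the runner module is importable as app.runner (the exact filename is not shown in this view):

    import os

    # Hypothetical local launch: set the variables the runner reads, then start it.
    os.environ['CRAWLER_TYPE'] = 'openinsider'
    os.environ['SYMBOLS'] = 'PLTR,NVDA'
    os.environ['RUN_DAILY_AT'] = '12:00'

    from app.runner import start  # assumed import path; adjust to the actual runner module

    start()  # starts the Flask API on :8080 and blocks in the scheduling loop
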
app/services/notifications.py
@@ -42,6 +42,27 @@ def send_email(new_picks: List[Dict], cfg: EmailConfig) -> None:
     server.quit()
 
 
+def send_custom_email(subject: str, body: str, cfg: EmailConfig) -> None:
+    msg = MIMEMultipart()
+    msg['From'] = cfg.from_email
+    msg['To'] = cfg.to_email
+    msg['Subject'] = subject
+    msg.attach(MIMEText(body, 'plain', 'utf-8'))
+
+    if cfg.smtp_security == 'ssl':
+        server = smtplib.SMTP_SSL(cfg.smtp_server, cfg.smtp_port)
+    else:
+        server = smtplib.SMTP(cfg.smtp_server, cfg.smtp_port)
+        server.ehlo()
+        if cfg.smtp_security == 'starttls':
+            server.starttls()
+            server.ehlo()
+
+    server.login(cfg.username, cfg.password)
+    server.send_message(msg)
+    server.quit()
+
+
 def send_webhook(new_picks: List[Dict], url: str) -> None:
     message = f"🚨 Found {len(new_picks)} new Barron's stock picks!\n\n"
     for pick in new_picks[:5]:
@@ -53,6 +74,11 @@ def send_webhook(new_picks: List[Dict], url: str) -> None:
     requests.post(url, json=payload, timeout=10)
 
 
+def send_text_webhook(message: str, url: str) -> None:
+    payload = {"text": message}
+    requests.post(url, json=payload, timeout=10)
+
+
 def send_discord(new_picks: List[Dict], webhook: str) -> None:
     embed = {
         "title": "📈 New Barron's Stock Picks",
@@ -69,6 +95,22 @@ def send_discord(new_picks: List[Dict], webhook: str) -> None:
     requests.post(webhook, json={"embeds": [embed]}, timeout=10)
 
 
+def send_text_discord(title: str, description: str, lines: List[str], webhook: str) -> None:
+    embed = {
+        "title": title,
+        "description": description,
+        "color": 0x00ff00,
+        "fields": [],
+    }
+    for line in lines[:10]:
+        embed["fields"].append({
+            "name": line[:256],
+            "value": "\u200b",
+            "inline": False,
+        })
+    requests.post(webhook, json={"embeds": [embed]}, timeout=10)
+
+
 def build_test_pick() -> Dict:
     return {
         'title': f"[Test] Barron's notification - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}",
@@ -76,4 +118,3 @@ def build_test_pick() -> Dict:
         'scraped_at': datetime.now().isoformat(),
         'hash': hashlib.md5(str(datetime.now().timestamp()).encode()).hexdigest()[:8],
     }
-
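
For reference, a minimal usage sketch for the two new text helpers (run inside this repo; the webhook URLs are placeholders):

    from app.services import notifications as notif

    # Plain-text webhook: posts {"text": message} as JSON.
    notif.send_text_webhook("OpenInsider update: 2 new PLTR trades",
                            "https://example.com/hook")

    # Discord: one embed; each line becomes a field name (truncated to 256 chars,
    # Discord's per-field-name limit), with a zero-width space as the field value.
    notif.send_text_discord(
        title="OpenInsider insider-trade update - PLTR (2 items)",
        description="PLTR insider-trade update (OpenInsider)",
        lines=["• PLTR P - Purchase - Jane Doe qty 1,000 @ $30.00"],
        webhook="https://discord.com/api/webhooks/YOUR/DISCORD/WEBHOOK",
    )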