Eason Zhao
9ece101d8c
feat: alert manager set-up for all services
2025-10-20 21:31:25 -04:00
Nicolas
ec93fc987c
feat(opentelemetry): add additional Kubernetes labels and enhance logging attributes
2025-09-28 09:07:42 +08:00
Nicolas
921c9f4e10
feat(config): add ENVIRONMENT variable to central-storage-config.yaml for enhanced configuration management
2025-09-26 15:55:23 +08:00
Nicolas
beb355249a
fix: update centralStorage OpenTelemetry configuration
- Change start_at from 'end' to 'beginning' for complete log history
- Fix transform configuration to match authentication service
- Add k8s_cluster receiver to collect container stdout logs
- Remove problematic json_parser operator
- Ensure consistent log processing across services
2025-09-23 08:58:01 +08:00
Nicolas
cd9f42e143
Changed the secret configuration of central storage
2025-08-18 17:38:47 +08:00
Nicolas
69a2c112d1
feat(centralStorage): migrate alpha environment to use Azure Key Vault for sensitive data
- Add FreeleapsSecret configuration for Azure Key Vault integration
- Move sensitive data (mongodbUri, azureStorageDocumentApiKey, azureStorageDocumentApiEndpoint) from config to secrets
- Update deployment template to read from both config and FreeleapsSecret
- Comment out sensitive fields in central-storage-config.yaml
- Create freeleapssecret.yaml template for secret management
2025-08-18 16:24:11 +08:00
Nicolas
aa74f6a4f7
fix: remove invalid k8scluster receiver to resolve OpenTelemetry startup error
- Remove k8scluster receiver (invalid type name)
- Remove unused otlp receiver definitions
- Keep only filelog receiver which is actually used in pipeline
- Fixes CrashLoopBackOff issue in central-storage pod
2025-08-01 07:52:37 +08:00
Nicolas
d85f9408e4
fix: change Chinese comments to English in OpenTelemetry configs
- Replace Chinese comments with English in opentelemetry.yaml files
- Maintain consistent English-only comments in freeleaps-ops repository
- Keep the same functionality while improving code readability
2025-07-31 23:17:45 +08:00
Nicolas
5101e6d2cd
fix: optimize OpenTelemetry configuration to prevent duplicate log collection
- Change receivers from [filelog, otlp, k8scluster] to [filelog] only
- Prevent duplicate logs from multiple collection sources
- Keep only filelog receiver to collect logs from application log files
- This eliminates duplicate logs appearing in Grafana for devsvc and central-storage
2025-07-31 23:08:10 +08:00
Nicolas
cc73ad92a9
feat: add container logs collection to OpenTelemetry config for central storage
2025-07-31 11:27:28 +08:00
Nicolas
da7e43547f
fix: use hasKey to safely check for logIngest.persistence existence
2025-07-31 10:21:50 +08:00
Nicolas
85fa39f8e2
fix: add default values for logIngest.persistence to prevent nil pointer errors
2025-07-31 10:18:47 +08:00
Nicolas
7a7fcf1398
fix: optimize central storage logging configuration to resolve hourly log bursts
- Change start_at from beginning to end to avoid historical log duplication
- Add poll_interval: 1s for real-time file monitoring
- Optimize batch processing: send_batch_size: 1, timeout: 1s
- Add PVC template for log persistence to reduce log loss risk
- Update deployment to support persistent volume for logs
2025-07-31 10:14:55 +08:00
Nicolas
f9c2c7e696
Change OTC start_at to beginning to read existing logs
2025-07-29 18:04:53 +08:00
Nicolas
401a2166cd
Simplify OTC config: remove timestamp parsing to fix startup issues
2025-07-29 17:57:30 +08:00
Nicolas
7ae5966da1
Fix timestamp parsing: use move operator to extract timestamp from attributes
2025-07-29 17:54:25 +08:00
Nicolas
12a7f1b98c
Fix timestamp parsing in OTC config for central-storage logs
2025-07-29 17:40:57 +08:00
Nicolas
c4bd3df730
Fix OTLP exporter config: remove unsupported labels options
2025-07-29 17:22:40 +08:00
Nicolas
18eb4df3fb
Revert to OTLP config to fix OTC startup - will investigate Loki config later
2025-07-29 17:16:46 +08:00
Nicolas
d234716697
Comment out Loki labels config to fix OTC startup - can be enabled later
2025-07-29 17:13:01 +08:00
Nicolas
b1c9f59cff
Simplify Loki exporter configuration to fix OTC startup issues
2025-07-29 17:08:32 +08:00
Nicolas
a1d5652efb
Fix Loki exporter configuration: remove invalid keys
2025-07-29 17:05:24 +08:00
Nicolas
b7b6ddcae4
Fix central-storage logging: change OTLP exporter to Loki HTTP exporter
2025-07-29 16:47:47 +08:00
Nicolas
acd5379c27
fix: restore json_parser with timestamp parsing based on local testing
2025-07-29 15:40:26 +08:00
Nicolas
beab7cd117
fix: remove all parsing operators to use raw log content
2025-07-29 15:29:53 +08:00
Nicolas
37db3b4404
fix: remove time_parser to use default timestamp handling
2025-07-29 14:46:13 +08:00
Nicolas
5479b85968
fix: separate timestamp parsing from json_parser to resolve time parsing error
2025-07-29 14:42:29 +08:00
Nicolas
aa8e626b7a
fix: remove problematic ParseJSON operations from OpenTelemetry transform processor
2025-07-29 14:30:00 +08:00
Nicolas
22ab4b99ef
fix: simplify OpenTelemetry Collector config to resolve timestamp parsing errors
2025-07-29 14:07:39 +08:00
Nicolas
5ced605a20
fix: improve OpenTelemetry Collector JSON parsing for central storage logs
2025-07-29 10:15:39 +08:00
Nicolas
605e66a26b
fix: add missing logging environment variables for central storage K8s deployment
- Add LOG_BASE_PATH environment variable
- Add BACKEND_LOG_FILE_NAME environment variable
- Add APPLICATION_ACTIVITY_LOG environment variable
- Fix logging path configuration for K8s environment
2025-07-29 09:36:01 +08:00
Nicolas
8fc5281e23
feat: add DEBUG_MODE environment variable for central-storage
2025-07-28 21:00:33 +08:00
Nicolas
381d25c11b
chore(logging): update logPathPattern to /var/log/pods/*/*/*.log for k8s standard log collection
2025-07-25 11:05:28 +08:00
zhenyus
3aad06eeb7
feat: add Vertical Pod Autoscaler (VPA) configuration across multiple services for improved resource management
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-06-10 23:53:31 +08:00
zhenyus
6d9d15d4d2
Add OpenTelemetry logging support across multiple services
- Introduced log ingestion configuration in values files for centralStorage, content, notification, and payment services.
- Updated deployment templates to conditionally include OpenTelemetry annotations and volume mounts based on log ingestion settings.
- Added OpenTelemetry RBAC configurations for service accounts and cluster roles to enable logging.
- Implemented OpenTelemetry collector configuration to process logs and export them to Loki.
- Ensured compatibility with existing Helm chart structure and maintained backward compatibility for services without log ingestion enabled.
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-04-21 22:03:00 +08:00
zhenyus
b3ad25d6e5
fix: update config file paths in config-checksum annotations for deployment templates
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-27 14:51:41 +08:00
zhenyus
df2ab1c3a4
fix: correct config file path in config-checksum annotations for deployment templates
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-27 14:50:08 +08:00
zhenyus
580f3f8d71
feat: add config checksum annotations to deployment templates and update site URL in values files
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-27 14:48:25 +08:00
zhenyus
a90ee717b2
fix: update Prometheus query intervals from 30s to 1m for improved accuracy
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-18 05:59:05 +08:00
zhenyus
336f1fa0e2
fix: remove hardcoded uid values in dashboard templates for consistency
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-18 00:26:16 +08:00
zhenyus
a1ab488cbf
fix: update dashboard file names to use dynamic values for consistency
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-18 00:25:04 +08:00
zhenyus
a6803210e0
feat: add dashboard configuration for content, payment, notification, and central storage services
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-18 00:22:06 +08:00
zhenyus
56094aa1bd
fix: remove version and managed-by labels from service monitor templates for consistency
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-17 23:53:49 +08:00
zhenyus
e8a07a08e6
feat: add metricsEnabled and probesEnabled configuration options to payment, content, centralStorage, authentication, and notification services
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-17 23:51:35 +08:00
zhenyus
96ab638756
fix: remove serviceMonitor.enabled variable from service templates in authentication, centralStorage, content, notification, and payment services
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-17 23:47:29 +08:00
zhenyus
e7bb64108c
fix: rename service monitor resources to use a consistent naming convention
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-17 23:45:47 +08:00
zhenyus
bb2bebc164
fix: update service monitor templates to use service-specific values for namespace, labels, interval, and scrapeTimeout
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-17 23:45:16 +08:00
zhenyus
c9cfa0827e
fix: update service monitor templates to check individual serviceMonitor.enabled values
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-17 23:42:46 +08:00
zhenyus
32198e2f9a
fix: remove port definitions from service monitor configurations
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-17 23:36:46 +08:00
zhenyus
5c9f74c609
fix: add name label to service monitor configuration
Signed-off-by: zhenyus <zhenyus@mathmast.com>
2025-03-17 23:26:34 +08:00