OpenSearch and Elasticsearch in PeopleSoft
Modern Search Architecture Design Patterns
If you’ve been working with PeopleSoft recently, you’ve probably noticed Oracle pushing toward modern search capabilities with Elasticsearch support. It’s one of the most significant shifts in PeopleSoft’s search infrastructure in over a decade.
But here’s the challenge: Oracle tells you that you can implement these technologies, not how to make the right decisions for your environment. After implementing OpenSearch for PeopleSoft, I want to share the patterns and strategies that actually work in production.
Why This Matters
Traditional PeopleSoft search relied on database queries—functional, but increasingly inadequate as data volumes grow and user expectations evolve. Modern users expect Google-like experiences: fast, relevant results delivered in milliseconds, with fuzzy matching that handles typos gracefully.
Elasticsearch and OpenSearch provide sub-second query performance across millions of records, relevance ranking that surfaces useful results first, and scalability through distributed architecture. Oracle added Elasticsearch support in PeopleTools 8.55, and OpenSearch (an open-source fork of Elasticsearch 7.10) has become increasingly relevant for organizations wanting to avoid licensing complications.
Key Decisions You’ll Face
Elasticsearch vs. OpenSearch
PeopleSoft officially supports Elasticsearch, but OpenSearch maintains API compatibility and can work as a drop-in replacement for many implementations.
OpenSearch advantages: Fully open-source (Apache 2.0), no licensing restrictions, strong backing from AWS.
Elasticsearch advantages: Officially documented and supported by Oracle, fewer unknowns.
Bottom line: For new implementations, OpenSearch offers compelling open-source benefits. If you need full Oracle support coverage, stick with Elasticsearch.
Deployment Architecture
You have two main options:
Single-Node (Dev/Test Only): Quick to set up, minimal resources, but no high availability. Fine for development, not for production.
Multi-Node Cluster (Production): Multiple nodes for redundancy and performance. Separate master nodes (cluster coordination; called cluster manager nodes in OpenSearch 2.x and later) from data nodes (indexing and queries). This is what you want for production.
Recommendation: For production clusters, use a managed service unless you have specific requirements demanding self-management. Your time is better spent on business problems than managing search infrastructure.
Database Integration Strategy
How you sync data from PeopleSoft to OpenSearch significantly impacts performance and reliability. Three patterns work:
Real-Time Synchronous: PeopleCode updates OpenSearch immediately during save operations. Simple but adds latency to transactions and increases failure points. Only for small datasets with infrequent updates.
Asynchronous with Message Queues: Decouple database updates from search indexing using message queues. Near-real-time updates without impacting save performance. Best balance for most implementations.
Scheduled Batch Sync: Periodic bulk updates via Application Engine. Efficient for large datasets where real-time search isn’t critical. Good for initial loads and reference data.
Recommendation: Use asynchronous queues for most scenarios. You get near-real-time updates without performance penalties.
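To make the asynchronous pattern concrete, here is a minimal sketch using Python's in-process queue as a stand-in for a real message queue. The event shape, the `send_batch` callback, and the batch size are assumptions for illustration; in PeopleSoft the producer side would be the save event and the queue would be whatever messaging layer you use.

```python
import queue
import threading

def drain_batch(q, max_items=500, timeout=0.5):
    """Collect up to max_items events, waiting at most `timeout` seconds for the first one."""
    batch = []
    try:
        batch.append(q.get(timeout=timeout))
        while len(batch) < max_items:
            batch.append(q.get_nowait())
    except queue.Empty:
        pass
    return batch

def index_worker(q, send_batch, stop_event):
    """Background consumer: forwards batches until asked to stop and the queue is empty."""
    while not (stop_event.is_set() and q.empty()):
        batch = drain_batch(q)
        if batch:
            send_batch(batch)  # hypothetical callback that would call the bulk API

# Demo: the "save" side only enqueues, so it never waits on the search cluster.
changes = queue.Queue()
sent = []                      # stands in for the search cluster
stop = threading.Event()
worker = threading.Thread(target=index_worker, args=(changes, sent.append, stop))
worker.start()
for emplid in ("E001", "E002", "E003"):
    changes.put({"emplid": emplid, "op": "upsert"})
stop.set()
worker.join()
```

The point of the design is that `changes.put` returns immediately, so a slow or unavailable search cluster delays indexing rather than the user's save.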
Database Considerations
Your database needs to support effective search integration. Focus on these areas:
Create Denormalized Search Views
Instead of having the indexing process join 5+ tables every time it runs, create flattened views that combine all the data needed for indexing. A single view for employee search might combine personal data, job information, department details, and contact information. This makes indexing faster and reduces load on production tables.
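As a rough illustration of a flattened view, the sketch below uses an in-memory SQLite database purely as a stand-in for the PeopleSoft database. All table, column, and view names here are hypothetical, not real PeopleSoft records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE personal (emplid TEXT PRIMARY KEY, name TEXT);
CREATE TABLE job      (emplid TEXT, deptid TEXT, jobtitle TEXT);
CREATE TABLE dept     (deptid TEXT PRIMARY KEY, descr TEXT);
CREATE TABLE email    (emplid TEXT, email_addr TEXT);

-- One row per employee, already joined, ready to become one search document.
CREATE VIEW employee_search_vw AS
SELECT p.emplid, p.name, j.jobtitle, d.descr AS dept_name, e.email_addr
FROM personal p
JOIN job  j ON j.emplid = p.emplid
JOIN dept d ON d.deptid = j.deptid
LEFT JOIN email e ON e.emplid = p.emplid;

INSERT INTO personal VALUES ('E001', 'Ada Lovelace');
INSERT INTO job      VALUES ('E001', 'D10', 'Analyst');
INSERT INTO dept     VALUES ('D10', 'Finance');
INSERT INTO email    VALUES ('E001', 'ada@example.com');
""")

conn.row_factory = sqlite3.Row
# Each view row maps directly to one flat document for the index.
docs = [dict(row) for row in conn.execute("SELECT * FROM employee_search_vw")]
```

The indexer then reads one view and never needs to know how the underlying tables relate.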
Track What Changed
You need to know what changed since the last index update. Two approaches:
Timestamp tracking: Add a LAST_UPDATE_DTTM column to search views. Simple and works for most cases, though a timestamp alone cannot tell you about deleted rows, so handle deletes separately.
Change capture tables: Maintain separate tables tracking what changed. More reliable for high-volume updates.
Start with timestamps. Move to more sophisticated approaches only if needed.
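A minimal sketch of timestamp tracking with a high-water mark follows. SQLite is again only a stand-in, and the table below plays the role of the search view; the column names other than LAST_UPDATE_DTTM are assumptions.

```python
import sqlite3

def fetch_changes(conn, since):
    """Return rows modified after `since` and the new high-water mark."""
    rows = conn.execute(
        "SELECT emplid, name, last_update_dttm FROM employee_search_vw "
        "WHERE last_update_dttm > ? ORDER BY last_update_dttm",
        (since,),
    ).fetchall()
    # If nothing changed, keep the previous mark.
    new_mark = rows[-1][2] if rows else since
    return rows, new_mark

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_search_vw (emplid TEXT, name TEXT, last_update_dttm TEXT)"
)
conn.executemany(
    "INSERT INTO employee_search_vw VALUES (?, ?, ?)",
    [
        ("E001", "Ada", "2024-01-01T09:00:00"),
        ("E002", "Grace", "2024-01-02T10:30:00"),
        ("E003", "Edsger", "2024-01-03T08:15:00"),
    ],
)

rows, mark = fetch_changes(conn, "2024-01-01T12:00:00")
```

Persist `mark` only after the batch has been indexed successfully, so a failed run is retried from the old mark. A strict `>` comparison can also miss rows that share the mark's exact timestamp, which is one reason high-volume systems move to change capture tables.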
Index Appropriately
Index the tables behind your search views on the fields you’ll use to query them, typically the primary key and the timestamp field. Partition large tables by date range if you’re dealing with millions of records. Standard database performance practices apply here.
Index Design Basics
How you structure OpenSearch indices affects performance and relevance.
Define Your Mappings
Don’t rely on automatic field detection. Explicitly define how each field should be indexed:
Text fields: Use for full-text search (like names, descriptions).
Keyword fields: Use for exact matches and filtering (like IDs, status codes).
Both: Many fields benefit from both. Store them as text for searching but also as keyword for exact filtering.
Choose appropriate analyzers for your data. Standard analyzer works for most English text. Use language-specific analyzers if you’re searching non-English content.
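An explicit mapping along these lines might look like the sketch below, written as the request body you would send when creating the index. The field names are hypothetical; the mapping keywords (`text`, `keyword`, `fields`, `analyzer`, `dynamic`) are standard in both Elasticsearch and OpenSearch.

```python
import json

employee_mapping = {
    "mappings": {
        # Reject documents with fields that are not declared below.
        "dynamic": "strict",
        "properties": {
            "emplid": {"type": "keyword"},
            "name": {
                "type": "text",
                "analyzer": "standard",
                # Multi-field: "name" for full-text search, "name.raw" for exact filtering.
                "fields": {"raw": {"type": "keyword"}},
            },
            "dept_name": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
            "status": {"type": "keyword"},
            "last_update_dttm": {"type": "date"},
        },
    }
}

# Body for the create-index request (PUT /<index-name>).
body = json.dumps(employee_mapping)
```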
Index Lifecycle Management
Use index aliases for zero-downtime reindexing. Point your application at an alias (like “employees_active”) instead of a specific index. When you need to rebuild, create a new index version, verify it’s good, then switch the alias. Delete the old version once the switch is complete.
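The alias switch can be done in a single atomic request to the `_aliases` endpoint. The helper below is a small sketch that only builds the request body; the index names are made up for the example.

```python
def alias_swap_actions(alias, old_index, new_index):
    """Build the body for POST /_aliases that moves `alias` in one atomic step."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

swap = alias_swap_actions("employees_active", "employees_v1", "employees_v2")
```

Because both actions are applied together, searches against the alias never see a moment where it points at nothing.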
For time-series data, implement lifecycle policies that automatically move old data to cheaper storage tiers and eventually delete it based on age.
Performance Optimization
Indexing Performance
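Use bulk operations. Never index documents one at a time. Batch them into groups of 500-1000 and send them together through the bulk API. Batching removes the per-request overhead and can improve throughput by 50-100x compared with single-document requests; treat the batch size as a starting point and tune it by measuring, since the best value depends on document size.
Disable refresh during bulk loads. When doing initial loads or large rebuilds, temporarily turn off automatic refresh by setting the index refresh_interval to -1. Restore the previous value when done so new documents become searchable again. This dramatically speeds up bulk indexing.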
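Here is a sketch of how the batching and the refresh setting fit together. It only builds request bodies (newline-delimited JSON for `_bulk`, and settings bodies for `PUT /<index>/_settings`) and leaves the HTTP calls out; the index name and document fields are assumptions.

```python
import json

def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def bulk_body(index, docs, id_field="emplid"):
    """Build an NDJSON body for POST /_bulk: one action line plus one source line per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc[id_field]}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

# Settings bodies for PUT /<index>/_settings around a large load.
PAUSE_REFRESH = {"index": {"refresh_interval": "-1"}}
RESUME_REFRESH = {"index": {"refresh_interval": "1s"}}  # 1s is the default

docs = [{"emplid": f"E{n:04d}", "name": f"Employee {n}"} for n in range(1200)]
bodies = [bulk_body("employees_v2", batch) for batch in chunked(docs, 500)]
```

In a real load you would send `PAUSE_REFRESH`, post each body in turn while checking the per-item errors in each response, then send `RESUME_REFRESH`.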
Query Performance
Use filters instead of queries when you can. Clauses in filter context (exact matches, ranges) skip relevance scoring and can be cached, so they are faster than scored queries. Only use scored queries for actual text search.
Limit result sizes. Don’t fetch 10,000 results when you’re showing 20. For deep pagination, use search_after instead of from/size pagination.
Boost relevance intelligently. Give higher scores to documents that are more useful—active records over inactive, recent updates over old ones. This surfaces better results first.
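The three query tips can be combined in one request body. The sketch below assumes hypothetical field names (`name`, `dept_name.raw`, `status`, `emplid`) and an active-status value of "A"; the query DSL itself (`bool`, `match`, `term`, `search_after`) is standard.

```python
def employee_search(text, dept=None, page_size=20, search_after=None):
    """Build a search body: scored text match, unscored filter, boost for active records."""
    filters = [{"term": {"dept_name.raw": dept}}] if dept else []
    body = {
        "size": page_size,  # fetch only what the page shows
        "query": {
            "bool": {
                # Scored full-text clause, tolerant of small typos.
                "must": [{"match": {"name": {"query": text, "fuzziness": "AUTO"}}}],
                # Filter context: no scoring, cacheable.
                "filter": filters,
                # Optional clause that raises the score of active records.
                "should": [{"term": {"status": {"value": "A", "boost": 2.0}}}],
            }
        },
        # search_after needs a deterministic sort; emplid breaks score ties.
        "sort": [{"_score": "desc"}, {"emplid": "asc"}],
    }
    if search_after is not None:
        body["search_after"] = search_after
    return body

first_page = employee_search("smith", dept="Finance")
# For the next page, pass the "sort" values of the last hit on the previous page.
next_page = employee_search("smith", dept="Finance", search_after=[3.2, "E0420"])
```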
Cluster Sizing
Plan for 1.5-2x your source data size (accounting for replicas and overhead). Allocate about 50% of node RAM to the JVM heap, but keep the heap just under 32GB so the JVM can still use compressed object pointers, and leave the rest for the OS file cache. Start with one primary shard per node and adjust based on shard size; aim for 20-50GB per shard.
Example: For 100GB of search data with one replica, you need about 200GB of storage and 3 nodes with 64GB RAM each (roughly 31GB heap per node). Three primary shards of about 33GB each, plus one replica of each, gives 6 shards total.
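The sizing arithmetic is simple enough to script. This is a back-of-the-envelope sketch only: the 35GB target shard size is my assumption (the middle of the 20-50GB range), and the heap is capped at 31GB to stay just under the 32GB ceiling.

```python
import math

def size_cluster(primary_gb, replicas=1, node_ram_gb=64, target_shard_gb=35):
    """Rough sizing from the rules of thumb above; not a substitute for load testing."""
    copies = 1 + replicas  # each primary plus its replicas
    primaries = max(1, math.ceil(primary_gb / target_shard_gb))
    return {
        "storage_gb": primary_gb * copies,
        "heap_gb": min(node_ram_gb // 2, 31),  # half of RAM, kept under 32GB
        "primary_shards": primaries,
        "total_shards": primaries * copies,
        "gb_per_shard": round(primary_gb / primaries, 1),
    }

plan = size_cluster(100)
```

For 100GB with one replica this gives 200GB of storage, 3 primary shards of about 33GB each, and 6 shards in total. Note that with one replica the total shard count is always twice the primary count.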
Integrating with PeopleSoft
Three approaches work in practice:
Search Framework Integration: Use PeopleSoft’s built-in Search Framework with custom PeopleCode that calls OpenSearch REST APIs. Transform results back to PeopleSoft format. Good for simple implementations.
Integration Broker: Define service operations for index updates and route them to OpenSearch. Trigger from save events in PeopleCode. Handles async communication well. Better for complex scenarios.
Middleware Layer: For complex environments, add a middleware layer between PeopleSoft and OpenSearch. Handles transformation, error handling, rate limiting, and monitoring. More architecture but more robust.
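Whichever approach you pick, something has to turn the search response into rows the PeopleSoft side can consume. The sketch below flattens the standard `hits` structure of a search response; the output row shape and field list are assumptions, not a PeopleSoft-defined format.

```python
def hits_to_rows(response, fields):
    """Flatten a search response into (total, rows) with a fixed set of columns."""
    rows = []
    for hit in response["hits"]["hits"]:
        source = hit.get("_source", {})
        # Missing fields become empty strings so every row has the same columns.
        row = {name: source.get(name, "") for name in fields}
        row["score"] = hit.get("_score")
        rows.append(row)
    return response["hits"]["total"]["value"], rows

sample_response = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [
            {"_id": "E001", "_score": 4.1, "_source": {"emplid": "E001", "name": "Ada Lovelace"}},
            {"_id": "E002", "_score": 2.7, "_source": {"emplid": "E002"}},
        ],
    }
}
total, rows = hits_to_rows(sample_response, ["emplid", "name"])
```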
Security Essentials
Transport encryption: Always use TLS for communication between PeopleSoft and OpenSearch. Use certificate-based authentication where possible.
Access control: Implement role-based access in OpenSearch. Use document-level security to filter results by user permissions. Don’t index sensitive data that users shouldn’t see.
Network security: Place OpenSearch in private subnets. Use firewalls to restrict access to only application servers that need it.
Audit logging: Enable audit logs. Track who searches what and when. Monitor for unusual patterns.
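For document-level security, the OpenSearch Security plugin lets a role carry a query that limits which documents its users can see. The sketch below builds a role body in the shape I understand that plugin's roles API to expect. Treat the exact layout as an assumption to verify against your version; the `deptid` field and the role's purpose are hypothetical.

```python
import json

def dept_reader_role(index_pattern, deptid):
    """Build a read-only role body restricted to one department's documents."""
    dls_query = {"term": {"deptid": deptid}}
    return {
        "index_permissions": [
            {
                "index_patterns": [index_pattern],
                # The document-level security query is passed as a JSON string.
                "dls": json.dumps(dls_query),
                "allowed_actions": ["read"],
            }
        ]
    }

role = dept_reader_role("employees_*", "D10")
```

Elasticsearch offers a comparable feature, but with a different role format and licensing, so this body is specific to OpenSearch.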
Monitoring and Maintenance
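Monitor cluster health: Green (all primary and replica shards allocated), Yellow (all primaries allocated but at least one replica is not, so redundancy is reduced), Red (at least one primary shard is unallocated, so some data is unavailable). Set up alerts for yellow/red states.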
Watch performance metrics: Query latency, indexing throughput, resource utilization (CPU, memory, disk).
Regular maintenance: Back up using the snapshot/restore API. Force merge old indices once they are no longer being written to. Review capacity quarterly. Test restore procedures before you need them.
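A health check can be as small as the sketch below, which turns a `GET /_cluster/health` response into an alert message. The severity labels and message format are my own choices; `status`, `cluster_name`, and `unassigned_shards` are fields of the real response.

```python
def health_alert(health):
    """Return an alert string for yellow/red clusters, or None when green."""
    # Treat a missing status as red so a broken check never looks healthy.
    status = health.get("status", "red")
    if status == "green":
        return None
    severity = "critical" if status == "red" else "warning"
    return (
        f"{severity}: cluster {health.get('cluster_name', '?')} is {status}, "
        f"{health.get('unassigned_shards', 0)} unassigned shards"
    )

alert = health_alert(
    {"cluster_name": "psft-search", "status": "yellow", "unassigned_shards": 3}
)
```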
Common Mistakes to Avoid
Over-sharding: Too many small shards hurts performance. Fewer, larger shards usually work better.
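Under-sizing heap: Too little JVM heap leads to constant garbage collection and slow or failed queries. Size it using the 50%-of-RAM guideline rather than leaving the default.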
Synchronous indexing: Don’t make users wait for search indexing during save operations.
No monitoring: You can’t fix problems you don’t know about. Set up monitoring from day one.
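Ignoring replicas: Replicas provide redundancy, but a replica is never allocated on the same node as its primary. Run at least two data nodes and configure at least one replica, or a single node failure takes your data offline.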
Untested backups: Test your restore procedures before disaster strikes.
Getting Started
The key decisions are straightforward:
Choose OpenSearch or Elasticsearch based on your licensing preferences and support requirements. OpenSearch works well for most scenarios.
Use managed services unless you have compelling reasons for self-management. Your time is valuable.
Design asynchronous integration to protect transaction performance. Message queues are your friend.
Create denormalized views in your database to simplify indexing and reduce load.
Size appropriately for growth, not just current state. Plan for 18-24 months.
Start small—implement search for a single business object, measure performance, learn. Then expand as you build confidence. The organizations that succeed don’t try to boil the ocean. They start with one use case, do it well, and iterate.
What search challenges are you facing in your PeopleSoft environment? Share your experiences in the comments.