API/System Performance
API Performance:
Code design:
UI Pagination:
Implement pagination for UI display of large result sets by dividing the results into numbered pages. This allows the user to view a manageable subset of results per page as opposed to all results on one long page. Pagination improves usability and performance when displaying potentially thousands of results by only loading a small page of results at a time.
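The idea above can be sketched in a few lines. This is a minimal in-memory illustration, not a definitive implementation: the `Page` type, the `paginate` helper, and the default page size of 20 are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import ceil
from typing import Generic, List, Sequence, TypeVar

T = TypeVar("T")


@dataclass
class Page(Generic[T]):
    items: List[T]      # the results shown on this page
    page: int           # 1-based page number
    page_size: int
    total_items: int

    @property
    def total_pages(self) -> int:
        return ceil(self.total_items / self.page_size)


def paginate(results: Sequence[T], page: int, page_size: int = 20) -> Page[T]:
    """Return one page of `results` (hypothetical helper for illustration)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    # Slicing past the end simply yields an empty page.
    return Page(list(results[start:start + page_size]), page, page_size, len(results))
```

In a real service the slicing would usually be pushed down to the database (for example with `LIMIT` and `OFFSET`, or a keyset condition) so that only one page of rows is ever loaded.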
Code Optimization:
- Remove unnecessary computations and function calls that do not contribute to the output.
- Refactor database queries to only retrieve required fields and add indexes to improve query speed.
- Minimize disk I/O by caching frequently accessed data and batching reads/writes.
- Optimize iterative algorithms by eliminating unnecessary loop iterations and using efficient data structures like hash maps.
- Avoid allocating memory unnecessarily within high-frequency functions or loops.
- Analyze algorithms for efficiency and replace expensive operations with more optimal solutions.
- Use data structures like arrays, hash maps, and binary trees rather than linked lists where appropriate.
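As a small illustration of the data-structure point, the sketch below compares a list-based membership test with a hash-set-based one. The function names are made up for the example; both return the same result, but the second avoids rescanning `b` for every element of `a`.

```python
def find_common_slow(a, b):
    # Each `x in b` scans the list, so the total work is O(len(a) * len(b)).
    return [x for x in a if x in b]


def find_common_fast(a, b):
    # Build the hash set once; each membership test is then O(1) on average.
    b_set = set(b)
    return [x for x in a if x in b_set]
```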
DB Optimization:
- Create indexes on columns frequently used in query criteria to speed up lookups.
- Analyze complex queries with the query planner and rewrite them to use joins, aggregations, and window functions in the database instead of issuing many small queries or post-processing rows in application code.
- Implement caching mechanisms to avoid expensive query recomputation and minimize trips to the database.
- Design schemas and queries to extract only the required columns and rows, and filter on indexed columns to avoid expensive full table scans.
- Structure queries to take advantage of database engine optimizations like parallelization and partition pruning.
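The effect of an index can be checked from the query plan. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `orders` table, its columns, and the index name are assumptions for illustration, and other database engines expose their plans differently (for example through `EXPLAIN`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL, notes TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total, notes) VALUES (?, ?, ?)",
    [(i % 100, i * 1.5, "n/a") for i in range(1000)],
)


def plan(sql, params=()):
    """Return SQLite's query plan for `sql` as one human-readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


# Select only the columns that are needed, and filter on customer_id.
query = "SELECT id, total FROM orders WHERE customer_id = ?"

plan_before = plan(query, (42,))   # without an index: a scan of the table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query, (42,))    # with the index: a search using it

print(plan_before)
print(plan_after)
```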
Parallel Processing:
- Use multithreading to execute independent tasks concurrently within the application.
- Implement asynchronous I/O and operations so that requests do not block on long-running tasks.
- Distribute work across threads, processes, or machines to scale out on multi-core/multi-machine architectures.
- Optimize resource usage through thread pools, async queues, and parallelized data structures.
- Avoid shared state and mutable data that require expensive synchronization.
- Use non-blocking data structures and algorithms like concurrent queues.
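A thread pool is one common way to run independent, I/O-bound tasks concurrently. In the sketch below, `fetch` is a hypothetical stand-in for a network or database call (the `time.sleep` only simulates waiting on I/O), and the worker count of 8 is an arbitrary assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(item_id: int) -> dict:
    """Hypothetical I/O-bound task; the sleep simulates network latency."""
    time.sleep(0.05)
    return {"id": item_id, "value": item_id * 2}


def fetch_all(item_ids, max_workers: int = 8) -> list:
    # The tasks share no mutable state, so no locking is needed.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, even if tasks finish out of order.
        return list(pool.map(fetch, item_ids))
```

For CPU-bound work in CPython, a process pool (`concurrent.futures.ProcessPoolExecutor`) is usually the better fit, because threads contend for the interpreter lock.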
Use appropriate libraries and frameworks:
- Use standard libraries or robust third-party options for common needs like HTTP clients, caching, logging, and configuration.
- For machine learning, data processing, and similar workloads, adopt highly optimized libraries like TensorFlow, Pandas, and NumPy rather than custom code.
- Build on high-performance application frameworks suited for the specific language/task instead of raw sockets/threads.
- Thoroughly evaluate libraries for quality, compatibility, and active maintenance before adoption.
- Contribute fixes and improvements back to the open-source projects used.
System design:
Logging:
Logs=>Buffer=>Disk
Write logs to a lock-free ring buffer in memory to avoid blocking on disk I/O. This provides higher throughput and lower latency for logging. Periodically flush the in-memory buffer to disk to persist the logs. The lock-free ring buffer lets multiple writers append logs concurrently without synchronization overhead, while flushing to disk asynchronously moves I/O work off the critical path. This approach combines the performance benefits of in-memory logging with the durability of disk-based persistence.
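The Logs=>Buffer=>Disk flow might look roughly like the following sketch. It is an approximation under stated assumptions: `BufferedLogger` is a made-up class, and a bounded `collections.deque` stands in for the ring buffer. A deque's `append` and `popleft` are thread-safe in CPython, but it is not a true lock-free structure, and when the buffer is full the oldest entries are silently dropped.

```python
import collections
import threading


class BufferedLogger:
    """Hypothetical logger: append to memory, flush to disk in the background."""

    def __init__(self, path, capacity=1024, flush_interval=1.0):
        self._path = path
        # Bounded deque: once full, the oldest entry is overwritten (ring behavior).
        self._buffer = collections.deque(maxlen=capacity)
        self._interval = flush_interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def log(self, message: str) -> None:
        # Hot path: an in-memory append only, no disk I/O.
        self._buffer.append(message)

    def flush(self) -> None:
        lines = []
        while True:
            try:
                lines.append(self._buffer.popleft())
            except IndexError:  # buffer drained
                break
        if lines:
            with open(self._path, "a", encoding="utf-8") as f:
                f.write("\n".join(lines) + "\n")

    def _run(self) -> None:
        # Flush every `flush_interval` seconds until close() is called.
        while not self._stop.wait(self._interval):
            self.flush()

    def close(self) -> None:
        self._stop.set()
        self._thread.join()
        self.flush()  # persist whatever is still buffered
```

The trade-off is durability: anything still in the buffer is lost if the process crashes before the next flush, so the flush interval and capacity should reflect how much log loss is acceptable.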
Caching:
read from db => read from cache <= update cache
- Cache frequently used data in memory to avoid unnecessary database queries.
- On a cache miss, query the database and add the result to the cache.
- Manage cache invalidation, eviction policies, and refresh strategies.
- Choose appropriate caching libraries like Memcached or Redis.
- Implement cache aside, read through, and write through patterns as appropriate.
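A cache-aside read path with time-based expiry can be sketched as below. The `CacheAside` class, the `load_from_db` callback, and the 60-second TTL are assumptions for illustration; a plain dictionary stands in for an external cache such as Memcached or Redis.

```python
import time


class CacheAside:
    """Hypothetical cache-aside wrapper around a slow loader function."""

    def __init__(self, load_from_db, ttl_seconds: float = 60.0):
        self._load = load_from_db      # called only on a miss
        self._ttl = ttl_seconds
        self._store = {}               # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]            # cache hit
        value = self._load(key)        # cache miss: go to the database
        self._store[key] = (time.monotonic() + self._ttl, value)
        return value

    def invalidate(self, key) -> None:
        # Call after the underlying row is updated so the next read reloads it.
        self._store.pop(key, None)
```

A usage example, with a fake loader that counts how often the "database" is hit:

```python
calls = []

def load_user(user_id):
    calls.append(user_id)
    return {"id": user_id}

cache = CacheAside(load_user)
cache.get(1)
cache.get(1)          # served from the cache, the loader is not called again
cache.invalidate(1)
cache.get(1)          # reloaded after invalidation
```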
Load Balancing:
- Use a load balancer to monitor server health and route traffic accordingly.
- Scale horizontally by adding or removing servers to adjust capacity dynamically.
- Employ strategies like round-robin, least connections, or application-aware routing.
- Enable session persistence (sticky sessions) when needed so that a given user's requests are routed to the same server.
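To make the round-robin strategy with health awareness concrete, here is a minimal sketch. `RoundRobinBalancer` and its method names are hypothetical; a production load balancer would run its own health checks rather than rely on callers to mark servers up or down.

```python
import itertools


class RoundRobinBalancer:
    """Hypothetical round-robin balancer that skips unhealthy servers."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._cycle = itertools.cycle(self._servers)

    def mark_down(self, server) -> None:
        self._healthy.discard(server)

    def mark_up(self, server) -> None:
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self):
        # Look at each server at most once per call, skipping unhealthy ones.
        for _ in range(len(self._servers)):
            server = next(self._cycle)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")
```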
Parallel DB Query Processing:
- Enable the database to utilize multiple CPUs/cores.
- Use hints to selectively parallelize complex queries that will benefit.
- Design tables and indexes to take advantage of performance gains from parallelism.
- Avoid over-parallelization that can impede performance due to context switching overhead.
- Analyze query plans to identify parallel execution opportunities.
DB Configuration:
- Allocate sufficient memory and set cache sizes based on database workload and hardware.
- Set optimal concurrency parameters for parallelism, threads, and connections.
- Tune I/O using storage striping, read/write throttling, and asynchronous I/O.
- Enable and appropriately size database caching features such as the buffer pool and plan cache.
- Leverage database advisors and tools to identify and apply recommended tuning for slow queries.
Optimization of network:
- Compress API responses and transferred data to reduce bandwidth usage.
- Leverage a CDN to cache and distribute static content closer to users.
- Encrypt all traffic with TLS, using efficient cipher suites such as ECDHE key exchange combined with AES-GCM encryption.
- Monitor network metrics like latency, errors, and saturation to identify bottlenecks.
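Response compression can be illustrated with the standard library. The helper names below are assumptions; in practice the web server or framework usually applies gzip automatically and sets the `Content-Encoding: gzip` response header when the client advertises support for it.

```python
import gzip
import json


def compress_json(payload) -> bytes:
    """Serialize `payload` compactly and gzip it (hypothetical helper)."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


def decompress_json(body: bytes):
    return json.loads(gzip.decompress(body).decode("utf-8"))


# Repetitive JSON, typical of list endpoints, compresses well.
payload = {"items": [{"id": i, "status": "active"} for i in range(500)]}
raw_size = len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
compressed_size = len(compress_json(payload))
print(raw_size, compressed_size)
```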
Monitoring and Profiling:
- Track the performance of likely bottlenecks such as databases, caches, and networks.
- Monitor VM/container resource usage like CPU, memory, disk, and network I/O.
- Implement tracing to follow request flows across services.
- Set performance budgets and alerts to be notified of regressions.
- Use APM tools to isolate and debug performance problems in production.
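A performance budget can be approximated in application code with a timing decorator. This is only a sketch: the `timed` decorator, the `timings` store, and `over_budget` are made-up names, and a real deployment would export such measurements to an APM or metrics system instead of keeping them in a dictionary.

```python
import collections
import functools
import time

# Operation name -> list of observed durations in seconds.
timings = collections.defaultdict(list)


def timed(name):
    """Record the wall-clock duration of each call under `name`."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                # Recorded even when the call raises.
                timings[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


def over_budget(name, budget_seconds: float) -> bool:
    """True if any recorded call for `name` exceeded the budget."""
    samples = timings.get(name, [])
    return bool(samples) and max(samples) > budget_seconds
```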
Horizontal Scaling:
- Launch additional application server instances to share the increasing load.
- Leverage cloud infrastructure to dynamically auto-scale capacity up or down.
- Distribute requests across servers using load balancing and reverse proxies.
- Design stateless components to easily scale out without synchronization bottlenecks.
- Partition or shard data across distributed databases.
- Take advantage of cloud-managed services like object stores, queues, and caches.
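One simple way to partition data across databases is to route each key to a shard by hashing it. The sketch below assumes string keys and a fixed shard count; `shard_for` is a hypothetical helper. It uses `hashlib` rather than the built-in `hash()` because string hashes from `hash()` vary between Python processes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map `key` to a shard index in [0, num_shards) using a stable hash."""
    if num_shards < 1:
        raise ValueError("num_shards must be >= 1")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo sharding remaps most keys when the number of shards changes; consistent hashing is a common alternative when shards are added or removed frequently.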
Throttling and Rate Limiting:
- Limit the number of requests per client to prevent abuse and costly overuse.
- Apply different thresholds based on request types to allocate capacity.
- Dynamically adjust limits based on overall system load using real-time metrics.
- Enforce quotas on a timeframe like daily, monthly, or per billing cycle.
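Per-client request limits are often implemented with a token bucket. The following is a minimal sketch under assumptions: the `TokenBucket` class and its parameters are illustrative, and the injectable `clock` exists only to make the behavior easy to test. A per-client limiter would keep one bucket per client key.

```python
import time


class TokenBucket:
    """Hypothetical token bucket: `rate` tokens/second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)   # start full
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False                     # caller should reject, e.g. with HTTP 429
```

Passing a larger `cost` for expensive request types is one way to apply different thresholds per request type with a single bucket.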
Connection pool:
Request => connection pool => DB
- Maintain a pool of open database connections for reuse instead of opening and closing a connection per request; a database proxy can provide this pooling.
- Eliminate the overhead of repeated connection creation and teardown which adds latency.
- Tune pool configuration for optimal size, timeout settings, and concurrency based on usage.
- Scale up the pool as the load increases to maintain performance.
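The Request => connection pool => DB flow can be sketched with a fixed-size pool built on a thread-safe queue. `ConnectionPool` is a made-up class, and SQLite is used only so the example is self-contained; real applications would normally use the pooling offered by their database driver, ORM, or proxy.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool of reusable database connections."""

    def __init__(self, database: str, size: int = 4, timeout: float = 5.0):
        self._timeout = timeout
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a connection be used by other threads.
            self._pool.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._pool.get(timeout=self._timeout)
        try:
            yield conn
        finally:
            self._pool.put(conn)   # return it to the pool instead of closing

    def close(self) -> None:
        while not self._pool.empty():
            self._pool.get_nowait().close()
```

A usage example:

```python
pool = ConnectionPool(":memory:", size=2)
with pool.connection() as conn:
    print(conn.execute("SELECT 1").fetchone())
pool.close()
```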