Designing enterprise infrastructure that fits the requirements of multiple teams for workload performance can be a challenge.
IT departments strive to have in place the equipment and tools all departments need in order to work effectively and efficiently. As workloads become more demanding, balancing resources without creating friction—while ensuring all department needs are still met—is one of the most difficult, yet essential, tasks.
Modern tools to support new workloads, like artificial intelligence (AI) and machine learning (ML), can make things even harder. They often require more resources, which could critically impact other areas of an organization.
Capacity planning is one of the most important elements of proper workload design. Here are three areas to assess as you start your journey to better workload performance:
1. Infrastructure assessment
When planning for capacity with performance in mind, you must first have visibility into your existing environment. Identifying your current assets across the entire application, tools, and hardware stack, as well as evaluating how those assets are accessed, is a key first step.
In general, you want to identify:
- Applications that require specialized hardware, such as GPUs, FPGAs, or InfiniBand
- Mission-critical applications that must run 24/7 without interruption
- Non-mission-critical applications that can yield to higher-priority systems
- Data sources, storage, and access patterns
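As a minimal sketch of what that inventory could look like in practice, the Python below tags each application with the attributes from the list above and groups them. The `App` fields, the `summarize` helper, and the application names are all hypothetical placeholders, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class App:
    """One entry in a hypothetical application inventory."""
    name: str
    mission_critical: bool = False
    special_hardware: tuple = ()   # e.g. ("GPU",) or ("InfiniBand",)
    data_sources: tuple = ()       # e.g. ("orders-db",)


def summarize(apps):
    """Group application names by the categories in the list above."""
    return {
        "special_hardware": [a.name for a in apps if a.special_hardware],
        "mission_critical": [a.name for a in apps if a.mission_critical],
        "can_yield": [a.name for a in apps if not a.mission_critical],
    }


# Illustrative entries only; replace with your own environment's data.
inventory = [
    App("fraud-scoring", mission_critical=True, special_hardware=("GPU",)),
    App("nightly-reports", data_sources=("warehouse",)),
    App("order-api", mission_critical=True, data_sources=("orders-db",)),
]

summary = summarize(inventory)
```

Even a simple structure like this makes it easier to see which applications compete for scarce hardware and which ones can be rescheduled when capacity is tight.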
2. Application assessment
The next step is to clearly articulate your new application dependencies and performance requirements. This includes:
- I/O, bandwidth, storage subsystem types, and other performance characteristics, evaluated with every team that will use the new workload
- A thorough understanding of what your organization is currently using and what you will need to meet future goals
- A dynamic design that helps ensure your infrastructure can adapt as workloads grow and change
- The ability to run workloads when and where they’re needed, which can include using modern resource scheduling technologies such as Kubernetes to distribute resources and workloads effectively
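To illustrate the idea of letting higher-priority workloads claim scarce resources first, here is a small Python sketch. It is a deliberately simplified greedy model, not how Kubernetes or any real scheduler works, and the workload names, priorities, and GPU counts are made-up assumptions.

```python
def place(workloads, gpu_capacity):
    """Admit workloads in descending priority order until GPUs run out.

    workloads: list of (name, priority, gpus_needed) tuples.
    Returns (admitted names, deferred names, GPUs left over).
    """
    admitted, deferred = [], []
    free = gpu_capacity
    # Highest priority first; ties keep their original order.
    for name, priority, gpus in sorted(workloads, key=lambda w: -w[1]):
        if gpus <= free:
            admitted.append(name)
            free -= gpus
        else:
            deferred.append(name)
    return admitted, deferred, free


# Hypothetical workloads: (name, priority, GPUs needed).
workloads = [
    ("batch-etl", 1, 2),
    ("training", 5, 4),
    ("inference", 10, 2),
    ("notebook", 1, 1),
]

admitted, deferred, free = place(workloads, gpu_capacity=7)
```

Running a model like this against your assessed requirements is a quick way to see which workloads would be deferred at a given capacity before you commit to hardware.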
3. Data assessment
The third step is to evaluate how your workloads and applications access data and how much data they handle. This means looking at:
- Key metrics like response times, queries, SLAs, and access permissions
- Subsystems in place to feed modern tools like GPUs. For example, high-end GPU servers like Dell EMC’s PowerEdge C4140 can require high-bandwidth networking as well as all-flash storage systems, such as Dell EMC’s Isilon platforms, to keep GPUs fully utilized
- Local storage capacity and traditional networking, such as multi-node compute and HCI platforms for large-scale workloads
- Understanding what types of data you are working with, which can play a big role in the design of your infrastructure
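A rough back-of-the-envelope calculation can show whether your storage and network can keep GPUs fed. The sketch below assumes a simple model in which every GPU reads a fixed number of samples per second. The GPU count, sample rate, sample size, link speed, and 70% utilization target are illustrative assumptions, not measurements from any specific platform.

```python
def required_gbps(num_gpus, samples_per_sec_per_gpu, bytes_per_sample):
    """Estimate the sustained read bandwidth (in Gbit/s) needed to feed the GPUs."""
    bytes_per_sec = num_gpus * samples_per_sec_per_gpu * bytes_per_sample
    return bytes_per_sec * 8 / 1e9


def link_is_sufficient(need_gbps, link_gbps, max_utilization=0.7):
    """Check the need against a link, leaving headroom for other traffic."""
    return need_gbps <= link_gbps * max_utilization


# Hypothetical server: 4 GPUs, 1,000 samples/s each, 150 KB per sample.
need = required_gbps(num_gpus=4, samples_per_sec_per_gpu=1000, bytes_per_sample=150_000)
fits_10g = link_is_sufficient(need, link_gbps=10)
fits_1g = link_is_sufficient(need, link_gbps=1)
```

With these example numbers the need comes out to about 4.8 Gbit/s, which fits a 10 Gbit/s link with headroom but not a 1 Gbit/s link. Substituting your own measured sample rates and sizes gives a first estimate to validate against real benchmarks.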
While this list is not exhaustive, we’ve found that these tenets should be a part of every workload capacity planning exercise.
If your organization is ready to design an infrastructure for AI and cloud-native workloads, download our free eBook to learn more.