Apprentices progress at their own pace – they demonstrate
competency in skills and knowledge through assessment tests,
but are not required to complete a specific number of hours.
Cloud Operations Specialist 1
Anonymous
North Carolina (SAA)
Individual state requirements may vary. Please contact your local apprenticeship office to ensure this version is suitable to your state’s requirements.
Work Process Content
On the Job Training
159
Skills
CONFIGURATION AND DEPLOYMENT
52
- Appropriate commands, structure, tools and automation/orchestration as needed.
- Platforms and Applications
- Interaction of cloud components and services: Network components, application components, storage components, compute components, security components
- Baselines
- Target hosts
- Existing systems
- Cloud architecture
- Cloud elements/target objects
- Apply the change management process: Approvals, Scheduling
- Refer to documentation and follow standard operating procedures
- Execute workflow
- Configure automation and orchestration, where appropriate, for the system being deployed
- Use commands and tools
- Document results.
- Underlying environmental considerations included in the testing plan: Shared components ( Storage, Compute, Network ), Production vs. development vs. QA, Sizing, Performance, High availability, Connectivity, Data integrity, Proper function, Replication, Load balancing, Automation/orchestration.
- Testing Techniques: Vulnerability testing, penetration testing, load testing.
- Consider success factor indicators of the testing environment: Sizing, Performance, Availability, Connectivity, Data integrity, Proper functionality.
- Document Results
- Baseline Comparisons
- SLA comparisons
- Cloud performance fluctuation variables.
- Cloud deployment models ( Public, Private, Hybrid, Community ).
- Network Components
- Applicable port and protocol considerations when extending to the cloud.
- Determine configuration for the applicable platforms as it applies to the network: VPN, IDS/IPS, DMZ, VXLAN, Address space required, Network segmentation and Micro segmentation.
- Available vs. proposed resources: CPU, RAM
- Memory technologies: Bursting and ballooning, Overcommitment ratio.
- CPU technologies: Hyperthreading, VT-x, Overcommitment ratio.
- Effect to HA/DR
- Performance considerations
- Cost considerations
- Energy Savings
- Dedicated compute environment vs. shared compute environment
- Requested IOPS and read/write throughput.
- Protection capabilities: High availability [ Failover zones ], Storage replication [ Regional, Multiregional, Synchronous and asynchronous ], Storage mirroring, Cloning, Redundancy level/factor.
- Storage types: NAS, DAS, SAN, Object storage.
- Access protocols
- Management differences
- Provisioning model (Thick provisioned, thin provisioned, encryption requirements, tokenization).
- Storage technologies (Deduplication technologies, Compression technologies).
- Storage tiers
- Overcommitting storage
- Security configurations for applicable platforms ( ACLs, Obfuscation, Zoning, User/host authentication and authorization)
- Migration types ( P2V, V2V, V2P, P2P, Storage migrations, Online vs. offline migrations )
- Source and destination format of the workload ( Virtualization format, Application and data portability ).
- Network connections and data transfer methodologies
- Standard operating procedures for the workload migration
- Environmental constraints ( Bandwidth, Working hour restrictions, Downtime impact, Peak time frames, Legal restrictions, Follow-the-sun constraints/time zones ).
- Identity management elements ( Identification, Authentication, Authorization [ Approvals, Access policy ], Federation [Single sign-on])
- Appropriate protocols given requirements
- Element considerations to deploy infrastructure services such as ( DNS, DHCP, Certificate services, Local agents, Antivirus, Load Balancer, Multifactor authentication, Firewall, IPS/IDS ).
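The overcommitment ratio items above (CPU and memory) can be illustrated with a small calculation. This is a minimal sketch, not part of the work process itself: the host sizes, the allocated totals, and the function names `overcommit_ratio` and `fits_policy` are hypothetical, and the acceptable ratio depends on the platform and company policy.

```python
def overcommit_ratio(allocated: float, physical: float) -> float:
    """Return the allocated-to-physical ratio (3.0 means 3:1)."""
    if physical <= 0:
        raise ValueError("physical capacity must be positive")
    return allocated / physical


def fits_policy(allocated: float, physical: float, max_ratio: float) -> bool:
    """True when the ratio stays at or below the policy limit."""
    return overcommit_ratio(allocated, physical) <= max_ratio


# Hypothetical host: 16 physical cores and 128 GiB of RAM.
cpu_ratio = overcommit_ratio(allocated=48, physical=16)    # vCPUs : cores
ram_ratio = overcommit_ratio(allocated=160, physical=128)  # GiB : GiB

print(f"CPU overcommitment {cpu_ratio}:1, RAM overcommitment {ram_ratio}:1")
```

Comparing the proposed allocation against a policy limit before deployment is one way to connect "available vs. proposed resources" to the performance and HA/DR considerations in the same list.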
SECURITY
22
- Company security policies.
- Apply security standards for the selected platform
- Compliance and audit requirements governing the environment ( Laws and regulations as they apply to the data )
- Encryption technologies ( IPSec, SSL/TLS, Other ciphers)
- Key and certificate management (PKI)
- Tunneling protocols ( L2TP, PPTP, GRE)
- Implement automation and orchestration processes as applicable.
- Appropriate configuration for the application platform as it applies to compute (Disabling unneeded ports and services, Account management policies, Host-based/software firewalls, Antivirus/anti-malware software, Patching, Deactivating default accounts).
- Authorization to objects in the cloud ( Processes, Resources [ Users, Groups, System ( compute, networks, storage ), Services ] )
- Effect of cloud service models on security implementations
- Effect of cloud deployment models on security implementations
- Access control methods (Role-based administration, mandatory access controls, discretionary access controls, nondiscretionary access controls, multifactor authentication, single sign-on)
- Data classification
- Concepts of segmentation and micro segmentation ( Network, Storage, Compute )
- Use encryption as defined.
- Use multifactor authentication as defined.
- Apply defined audit/compliance requirements.
- Tools ( APIs, Vendor applications, CLI, Web GUI, Cloud Portal)
- Techniques ( Orchestration, Scripting, Custom programming )
- Security services ( Firewall, antivirus/anti-malware, IPS/IDS, HIPS )
- Impact of security tools to systems and services ( Scope of impact ).
- Impact of security automation techniques as they relate to the criticality of systems ( Scope of impact ).
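Role-based administration, one of the access control methods listed above, can be sketched in a few lines. The role names, users, and actions below are hypothetical and chosen only for illustration. A real platform would enforce this through its own identity and access management service.

```python
# Hypothetical role and user tables; real systems store these in an IAM service.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "operator": {"read", "restart"},
    "admin": {"read", "restart", "delete"},
}
USER_ROLES = {
    "alice": {"admin"},
    "bob": {"viewer"},
}


def is_authorized(user: str, action: str) -> bool:
    """Allow an action only if one of the user's roles grants it."""
    roles = USER_ROLES.get(user, set())
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_authorized("alice", "delete"))  # admin role grants delete
print(is_authorized("bob", "delete"))    # viewer role does not
```

Unknown users and unknown roles fall through to "not authorized", which reflects a default-deny stance.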
MAINTENANCE
19
- Scope of cloud elements to be patched ( Hypervisors, virtual machines, virtual appliances, networking components, applications, storage components, clusters)
- Patching methodologies and standard operating procedures ( Production vs. development vs. QA, Rolling update, Blue-green deployment, Failover cluster )
- Use order of operations as it pertains to elements that will be patched.
- Dependency considerations
- Types of updates (Hotfix, Patch, Version update, Rollback)
- Automation workflow ( Runbook management - Single node, Orchestration - multiple nodes, multiple runbooks )
- Activities to be performed by automation tools ( snapshot, cloning, patching, restarting, shut down, maintenance mode, enable/disable alerts ).
- Backup types ( Snapshot/redirect-on-write, clone, full, differential, incremental, change block/delta tracking )
- Backup targets ( Replicas, local, remote )
- Other considerations ( SLAs, Backup schedule, configurations, objects, dependencies, online/offline )
- DR capabilities of a cloud service provider
- Other considerations ( SLAs for DR, RPO, RTO, Corporate guidelines, cloud service provider guidelines, bandwidth or ISP limitations, Techniques, site mirroring, replication, file transfer, archiving, third-party sites )
- Business continuity plan ( alternate sites, continuity of operations, connectivity, edge sites, equipment, availability, partners/third parties )
- SLAs for BCP and HA
- Maintenance schedules
- Impact and scope of maintenance tasks.
- Impact and scope of maintenance automation techniques.
- Include orchestration as appropriate.
- Maintenance automation tasks ( clearing logs, archiving logs, compressing drives, removing inactive accounts, removing stale DNS entries, removing orphaned resources, removing outdated rules from firewall, removing outdated rules from security, resource reclamation, maintain ACLs for the target object ).
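The difference between the full, differential, and incremental backup types listed above comes down to which point in time changes are measured from. The sketch below assumes a simplified model in which each file has a single last-modified timestamp. The function name `files_to_back_up` and the sample data are hypothetical.

```python
from datetime import datetime


def files_to_back_up(files, backup_type, last_full, last_backup):
    """Select file paths for a backup run.

    files: dict mapping path -> last-modified datetime
    last_full: time of the most recent full backup
    last_backup: time of the most recent backup of any type
    """
    if backup_type == "full":
        return sorted(files)              # everything, regardless of age
    if backup_type == "differential":
        cutoff = last_full                # changes since the last full backup
    elif backup_type == "incremental":
        cutoff = last_backup              # changes since the last backup of any type
    else:
        raise ValueError(f"unknown backup type: {backup_type}")
    return sorted(path for path, modified in files.items() if modified > cutoff)


# Hypothetical file set and backup history.
files = {
    "a.cfg": datetime(2023, 12, 30),
    "b.log": datetime(2024, 1, 2),
    "c.db": datetime(2024, 1, 4),
}
last_full = datetime(2024, 1, 1)
last_backup = datetime(2024, 1, 3)

for kind in ("full", "differential", "incremental"):
    print(kind, files_to_back_up(files, kind, last_full, last_backup))
```

In this model a restore from incremental backups needs the last full backup plus every incremental after it, while a restore from differentials needs the last full backup plus only the latest differential.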
MANAGEMENT
31
- Monitoring ( Target object baselines, target object anomalies, common alert methods/messaging, alerting based on deviation from baseline, event collection ).
- Event correlation
- Forecasting resource capacity (upsize/increase, downsize/decrease)
- Policies in support of event collection
- Policies to communicate alerts appropriately
- Resources needed based on cloud deployment models (Hybrid, Community, Public, Private).
- Capacity/elasticity of cloud environment
- Support agreements ( Cloud service model maintenance responsibility )
- Configuration management tool.
- Resource balancing techniques
- Change management ( advisory board, approval process, document actions taken - CMDB, Spreadsheets )
- Usage patterns
- Cloud bursting ( auto-scaling technology )
- Cloud provider migrations
- Extending cloud scope
- Application lifecycle ( application deployment, application upgrade, application retirement, application replacement, application migration, application feature use - increase/decrease )
- Business need change (Mergers/acquisitions/divestitures, cloud service requirement changes, impact of regulation and law changes)
- Identification
- Authentication methods ( Federation - single sign-on )
- Authorization methods ( ACLs, Permissions)
- Account lifecycle
- Account management policy ( lockout, password complexity rules )
- Automation and orchestration activities ( user account creation, permission settings resource access, user account removal, user account disablement )
- Procedures to confirm results ( CPU usage, RAM usage, Storage utilization, patch versions, network utilization, application version, auditing enabled, management tool compliance )
- Analyze performance trends
- Refer to baselines
- Refer to SLAs
- Tuning of cloud target objects ( compute, network, storage, service/application resources )
- Recommend changes to meet expected performance/capacity ( scale up/down [vertically], scale in/out [horizontally] )
- Chargeback/showback models ( Reporting based on company policies, reporting based on SLAs )
- Dashboard and reporting ( elasticity usage, connectivity, latency, capacity, overall utilization, cost, incidents, health, system availability - uptime/downtime )
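Alerting based on deviation from baseline, listed under Monitoring above, can be sketched as a standard-score check. This is one possible approach under stated assumptions: the sample values, the threshold of three standard deviations, and the function name `deviates_from_baseline` are all hypothetical, and production monitoring tools typically use richer models.

```python
from statistics import mean, stdev


def deviates_from_baseline(baseline, current, threshold=3.0):
    """Flag a reading that lies more than `threshold` standard deviations
    from the mean of the baseline samples."""
    mu = mean(baseline)
    sigma = stdev(baseline)      # sample standard deviation; needs >= 2 samples
    if sigma == 0:
        return current != mu     # flat baseline: any change is a deviation
    return abs(current - mu) / sigma > threshold


# Hypothetical CPU utilization samples (percent) taken during normal operation.
cpu_baseline = [40, 42, 38, 41, 39]

print(deviates_from_baseline(cpu_baseline, 41))  # within normal variation
print(deviates_from_baseline(cpu_baseline, 75))  # well outside the baseline
```

The same comparison can feed the "Analyze performance trends" and "Refer to baselines" items, for example by running it over each new sample before deciding whether to recommend scaling.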
TROUBLESHOOTING
35
- Common issues in the deployments ( breakdowns in the workflow, integration issues related to different cloud platforms, resource contention, connectivity issues, cloud service provider outage, licensing outages, template misconfiguration, time synchronization issues, language support, automation issues ).
- Exceeded cloud capacity boundaries ( Compute, storage, networking - IP address limitations | Bandwidth limitations, Licensing, variance in number of users, API request limit, batch job scheduling issues )
- Deviation from original baseline
- Unplanned expansions
- Breakdowns in workflow ( account mismatch issues, change management failure, server name changes, IP address changes, location changes, version/feature mismatch, automation tool incompatibility, job validation issue ).
- Common networking issues ( incorrect subnet, incorrect IP address, incorrect gateway, incorrect routing, DNS errors, QoS issues, misconfigured VLAN or VXLAN, misconfigured firewall rule, insufficient bandwidth, latency, misconfigured MTU/MSS, misconfigured proxy ).
- Network tool outputs
- Network connectivity tools ( ping, tracert/traceroute, telnet, netstat, nslookup/dig, ipconfig/ifconfig, route, arp, ssh, tcpdump ).
- Remote access tools for troubleshooting.
- Authentication issues ( account lockout/expiration )
- Authorization issues
- Federation and single sign-on issues
- Certificate expiration
- Certificate misconfiguration
- External attacks
- Internal attacks
- Privilege escalation
- Internal role change
- External role change
- Security device failure
- Incorrect hardening settings
- Unencrypted communication
- Unauthorized physical access
- Unencrypted data
- Weak or obsolete security technologies
- Insufficient security controls and processes
- Tunneling or encryption issues
- Always consider corporate policies, procedures, and impacts before implementing changes.
- Identify the problem ( Question the user and identify user changes to computer and perform backups before making changes. )
- Establish a theory of probable cause ( Question the obvious. If necessary, conduct internal or external research based on symptoms. )
- Test the theory to determine cause ( Once a theory is confirmed, determine the next steps to resolve the problem. If the theory is not confirmed, establish a new theory or escalate. )
- Establish a plan of action to resolve the problem and implement the solution.
- Verify full system functionality, and if applicable, implement preventive measures.
- Document findings, actions, and outcomes.
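Two of the common networking issues listed above, an incorrect subnet and an incorrect gateway, can be checked offline with Python's standard `ipaddress` module. The sketch below is illustrative only: the function name `check_host_config` and the sample addresses are hypothetical, and it does not replace live tools such as ping or traceroute.

```python
import ipaddress


def check_host_config(address_with_prefix: str, gateway: str) -> list:
    """Return a list of problems found in a host's IP settings."""
    iface = ipaddress.ip_interface(address_with_prefix)
    gw = ipaddress.ip_address(gateway)
    network = iface.network
    problems = []

    # A default gateway must be reachable on the host's own subnet.
    if gw not in network:
        problems.append(f"gateway {gw} is outside subnet {network}")

    # On ordinary IPv4 subnets, the network and broadcast addresses
    # cannot be assigned to a host.
    if iface.version == 4 and network.prefixlen < 31:
        if iface.ip in (network.network_address, network.broadcast_address):
            problems.append(f"{iface.ip} is reserved in {network}")

    return problems


print(check_host_config("10.0.1.25/24", "10.0.1.1"))   # no problems expected
print(check_host_config("10.0.1.25/24", "10.0.2.1"))   # gateway on another subnet
```

A check like this fits the "Test the theory to determine cause" step: it confirms or rules out a misconfiguration theory before any change is made to the system.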
Related Instruction Content
Training Provider(s):
Be Prepared America
Characteristics of Cloud Computing
Virtualization Technologies
Migration Planning
Networking Concepts in the Cloud
Hybrid Cloud and Multi-Cloud Networking
Security Configurations
Account Management
Storage Types
Monitoring Resources
Automation Workflow