open-source infrastructure at lyft
![Page 1: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/1.jpg)
Open-source Infrastructure at Lyft
Constance Caramanolis, Daniel Hochman
July 2017
![Page 2: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/2.jpg)
Agenda
- Overview of Lyft Architecture
- Open-source Infrastructure Projects
  - Confidant
  - Discovery
  - Ratelimit
  - Envoy
- Q&A
![Page 3: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/3.jpg)
Architecture (simplified)
[Diagram components: Front Envoy, Application, Envoy, Discovery, Confidant, Ratelimit; >100 clusters]
![Page 4: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/4.jpg)
Python
lyft / confidant
Your secret keeper. Stores secrets in DynamoDB, encrypted at rest.
1,105
12 contributors
November 2015
![Page 5: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/5.jpg)
How is a service configured?
lyft / location-service (Private)
environment.yaml:
common:
  PORT: 8080
  TIMEOUT_MS: 15000
development:
  USE_AUTH: False
staging:
  API_KEY: secret_key_igjq3i494fqq234qbc
production:
  API_KEY: secret_key_ojajf823jj49ij8h
![Page 6: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/6.jpg)
Confidant to the rescue!
Service: location-service
Credential: api_key: password123
![Page 7: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/7.jpg)
Behind the scenes
Application
IAM Role
EC2 Instance
Credential
api_key: password123
api_key = os.getenv('CREDENTIAL_API_KEY')
KMS
DynamoDB
Confidant
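The application side of this flow can be sketched as follows. This is a minimal sketch, assuming the credential has already been exported into the process environment under the `CREDENTIAL_API_KEY` name shown on the slide; the helper `load_credential` and its error handling are hypothetical and not part of Confidant.

```python
import os


def load_credential(name, environ=None):
    """Return the credential stored in the environment variable `name`.

    Hypothetical helper: it raises instead of silently running without a secret.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"missing credential: {name}")
    return value


if __name__ == "__main__":
    # Simulate the slide: the secret reaches the process via the environment.
    os.environ.setdefault("CREDENTIAL_API_KEY", "password123")
    api_key = load_credential("CREDENTIAL_API_KEY")
    print("loaded credential of length", len(api_key))
```

Failing fast on a missing variable is a design choice of this sketch; it keeps a misconfigured host from starting with an empty key.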
![Page 8: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/8.jpg)
Server-blind secrets
Highly sensitive secrets are encrypted and decrypted by the end-users.
Confidant stores but can't read them.
Confidant
KMS
IAM Role
EC2 Instance
![Page 9: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/9.jpg)
lyft / discovery
Provides a REST interface for querying the list of hosts that belong to a microservice
54
6 contributors
Python
August 2016
![Page 10: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/10.jpg)
Tracking hosts
POST /v1/registration/location-service
{
  "ip": "10.0.0.1",
  "port": 80,
  "revision": "da08f35b",
  "tags": {
    "id": "i-910203",
    "az": "us-east-1a",
    "canary": true
  }
}
Cron schedule: * * * * * (every minute)
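A host-side registration call might look like the sketch below. It assumes the body is sent as JSON exactly as shown on the slide; `DISCOVERY_URL` and `build_registration` are hypothetical names, and the request is only built here, not sent.

```python
import json
import urllib.request

DISCOVERY_URL = "http://discovery.example.internal"  # hypothetical address


def build_registration(service, ip, port, revision, tags):
    """Build (but do not send) the POST request that registers one host."""
    body = {"ip": ip, "port": port, "revision": revision, "tags": tags}
    return urllib.request.Request(
        url=f"{DISCOVERY_URL}/v1/registration/{service}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_registration(
    "location-service", "10.0.0.1", 80, "da08f35b",
    {"id": "i-910203", "az": "us-east-1a", "canary": True},
)
# A periodic job on the host would then send it, for example:
# urllib.request.urlopen(request, timeout=5)
```

Re-sending the same registration on a schedule doubles as a heartbeat: a host that stops posting simply ages out of the registry.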
![Page 11: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/11.jpg)
Storage
- Hosts are stored in DynamoDB
- The storage layer is abstracted, so other backends can be supported
- Hosts are removed if they have not reported since (now - HOST_TTL)
- Ecosystem designed to tolerate eventual consistency, unlike the strongly consistent Zookeeper, etcd, and Consul
- Pair with active healthchecks
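The TTL rule can be sketched in a few lines. This is an illustration only, assuming each host's last registration time is kept as a Unix timestamp; the `HOST_TTL` value and the `live_hosts` helper are hypothetical, not taken from the discovery service.

```python
import time

HOST_TTL = 600  # seconds; hypothetical value


def live_hosts(last_seen, now=None, ttl=HOST_TTL):
    """Return the hosts whose last registration is newer than now - ttl."""
    now = time.time() if now is None else now
    cutoff = now - ttl
    return {host: ts for host, ts in last_seen.items() if ts >= cutoff}


last_seen = {"10.0.0.1": 1000.0, "10.0.0.2": 300.0}
# With now=1000 and ttl=600 the cutoff is 400, so only 10.0.0.1 survives.
print(live_hosts(last_seen, now=1000.0, ttl=600))
```

Because a dead host lingers for up to one TTL, callers cannot treat the list as authoritative, which is why the slide pairs it with active healthchecks.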
![Page 12: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/12.jpg)
Fetching hosts
GET /v1/registration/<service>
{
  "hosts": [
    {
      "ip": "10.0.0.1", "port": 80, "revision": "da08f35b",
      "tags": {"id": "i-910203", "az": "us-east-1a", "canary": true}
    },
    ...
    {
      "ip": "10.0.0.2", "port": 80, "revision": "da08f35b",
      "tags": {"id": "i-121286", "az": "us-east-1d"}
    }
  ]
}
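A consumer of this response might filter hosts by tag, as in the sketch below. The payload mirrors the two hosts shown on the slide; `select_hosts` is a hypothetical helper, and treating a missing `canary` tag as false is an assumption.

```python
import json

RESPONSE = """
{"hosts": [
  {"ip": "10.0.0.1", "port": 80, "revision": "da08f35b",
   "tags": {"id": "i-910203", "az": "us-east-1a", "canary": true}},
  {"ip": "10.0.0.2", "port": 80, "revision": "da08f35b",
   "tags": {"id": "i-121286", "az": "us-east-1d"}}
]}
"""


def select_hosts(payload, az=None, canary=None):
    """Return (ip, port) pairs, optionally filtered by zone and canary flag."""
    selected = []
    for host in json.loads(payload)["hosts"]:
        tags = host.get("tags", {})
        if az is not None and tags.get("az") != az:
            continue
        # Assumption: a host without a "canary" tag is not a canary.
        if canary is not None and bool(tags.get("canary", False)) != canary:
            continue
        selected.append((host["ip"], host["port"]))
    return selected


print(select_hosts(RESPONSE, canary=False))
```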
![Page 13: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/13.jpg)
Envoy per-service configuration
Services list the hosts they want to talk to!
location-service/envoy.yaml:
internal_hosts:
  - jobscheduler
  - roads
external_hosts:
  - dynamodb_iad
  - kinesis_iad
/etc/envoy.conf (on the box)
![Page 14: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/14.jpg)
Active Healthcheck
[Diagram components: location-service (Application, Envoy), Discovery, destination clusters jobscheduler and roads (Application, Envoy); GET /healthcheck]
Every host healthchecks every host in a destination cluster
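The "every host checks every host" rule implies a full mesh of checks between a source cluster and its destinations. The sketch below enumerates those pairs; all host addresses are hypothetical, and `healthcheck_pairs` is an illustrative helper, not Envoy code.

```python
from itertools import product


def healthcheck_pairs(source_hosts, destination_clusters):
    """Return every (source, destination) pair that is checked each interval."""
    destinations = [
        host for hosts in destination_clusters.values() for host in hosts
    ]
    return list(product(source_hosts, destinations))


# Hypothetical hosts: 2 location-service hosts, 3 destination hosts in total.
sources = ["10.0.1.1", "10.0.1.2"]
clusters = {
    "jobscheduler": ["10.0.2.1", "10.0.2.2"],
    "roads": ["10.0.3.1"],
}
pairs = healthcheck_pairs(sources, clusters)
print(len(pairs))  # 2 sources x 3 destinations = 6 checks per interval
```

The count grows as the product of source and destination hosts, which is the cost of not relying on the registry alone for liveness.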
![Page 15: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/15.jpg)
lyft / ratelimit
Go/gRPC service designed to enable generic rate limit scenarios
224
6 contributors
Go
January 2017
![Page 16: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/16.jpg)
Why rate limit?
- Control flow
- Protect against attacks
- Bad actors
- Accidents happen (oops!)
![Page 17: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/17.jpg)
Rate Limit Service
- Written in Go
- Enable generic rate limit scenarios
- Decisions based on a domain and set of descriptors
- Settings configured at runtime
- Backed by Redis
[Diagram components: Ratelimit, Redis (INCR)]
![Page 18: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/18.jpg)
Domains and descriptors
Domain
Defines a container for a set of rate limits
Globally unique
e.g. "envoy_front"
Descriptors
Ordered list of key/value pairs
Case sensitive
e.g. ("destination_cluster", "location-service"), ("user_id", "1234")
![Page 19: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/19.jpg)
Limit definition
Runtime Setting
Defines the requests per unit for a descriptor.
![Page 20: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/20.jpg)
Request flow example
Definition
domain: test_domain
key: user_id
rate_limit:
  unit: hour
  requests_per_unit: 1
Rq1: ("user_id", "1234")
Redis state: user_id_1234: 1
Rs1: RateLimitResponse_OK
Rq2: ("user_id", "9876")
Redis state: user_id_1234: 1, user_id_9876: 1
Rs2: RateLimitResponse_OK
Rq3: ("user_id", "1234")
Redis state: user_id_1234: 2, user_id_9876: 1
Rs3: RateLimitResponse_OVER_LIMIT
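The definition and request flow on this slide can be reproduced with a small counter. This is a minimal sketch, not the ratelimit service's code: the `Counters` class is an in-memory stand-in for Redis whose `incr` mimics the INCR command, and adding the time window to the cache key is an assumption about how counts reset each hour.

```python
from collections import defaultdict

LIMITS = {"user_id": 1}  # requests_per_unit per descriptor key (unit: hour)

OK, OVER_LIMIT = "OK", "OVER_LIMIT"


class Counters:
    """In-memory stand-in for Redis; incr() mimics the INCR command."""

    def __init__(self):
        self._counts = defaultdict(int)

    def incr(self, key):
        self._counts[key] += 1
        return self._counts[key]


def should_rate_limit(counters, descriptor, window):
    """Count the request, then compare the counter with the configured limit."""
    key, value = descriptor
    # Assumption: the window (e.g. the current hour) is part of the cache key.
    count = counters.incr(f"{key}_{value}_{window}")
    return OK if count <= LIMITS[key] else OVER_LIMIT


counters = Counters()
requests = [("user_id", "1234"), ("user_id", "9876"), ("user_id", "1234")]
results = [should_rate_limit(counters, d, window=0) for d in requests]
print(results)  # ['OK', 'OK', 'OVER_LIMIT']
```

Incrementing first and comparing afterwards keeps the decision to a single atomic operation per request, at the cost of also counting requests that end up rejected.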
![Page 21: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/21.jpg)
Ratelimit Client
from lyft_idl.client.ratelimit.ratelimit_client import RateLimitClient

ratelimit_client = RateLimitClient(settings.LYFT_API_USER_AGENT)

# Determines whether or not to limit jsonp_messages_post according to ratelimit service.
def should_allow_jsonp_messages_post(ip_address, phone_number):
    domain = settings.get('RATE_LIMIT_DOMAIN')
    ip_descriptors = [(('jsonp_messages_post_from_ip_address', ip_address), )]
    phone_descriptors = [(('jsonp_messages_post_from_phone_number', phone_number), )]
    return (
        ratelimit_client.is_request_allowed(domain, ip_descriptors) and
        ratelimit_client.is_request_allowed(domain, phone_descriptors)
    )
![Page 22: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/22.jpg)
lyft / envoy
Front/service L7 proxy
1,924
62 contributors
C++
September 2016
![Page 23: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/23.jpg)
Why Envoy?
Service Oriented Architecture
- Many languages and frameworks
- Protocols (HTTP/1, HTTP/2, databases, caching, etc…)
- Partial implementation of SoA best practices (retries, timeouts, …)
- Observability
- Load balancers (AWS, F5)
![Page 24: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/24.jpg)
What is Envoy?
The network should be transparent to applications.
When network and application problems do occur, it should be easy to determine the source of the problem.
![Page 25: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/25.jpg)
What is Envoy?
- Modern C++11
- Runs alongside applications
- Service discovery integration
- Rate Limit integration
- HTTP/2 first (get gRPC!)
- Act as front/edge proxy
- Stats, Stats, Stats
- Logging
![Page 26: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/26.jpg)
Observability: Global Health
![Page 27: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/27.jpg)
Observability: Service to Service
![Page 28: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/28.jpg)
Envoy Client in Python (internal)
from lyft.api_client import EnvoyClient

switchboard_client = EnvoyClient(
    service='switchboard'
)
switchboard_client.post(
    "/v2/messages",
    data={
        'template': 'welcome'
    },
    headers={
        'x-lyft-user-id': 12345647363394
    }
)
![Page 29: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/29.jpg)
Envoy deployment @Lyft
- > 100 services
- > 10,000 hosts
- > 2,000,000 RPS
- All service to service traffic (REST and gRPC)
- MongoDB, DynamoDB, Redis proxy
- External service proxy (AWS and other partners)
- Kibana/Elasticsearch for logging
- LightStep for tracing
- Wavefront for stats
![Page 30: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/30.jpg)
Architecture Revisited
[Diagram components: Front Envoy, Application, Envoy, Discovery, Confidant, Ratelimit; >100 clusters]
![Page 31: Open-source Infrastructure at Lyft](https://reader033.vdocument.in/reader033/viewer/2022051301/5a649b167f8b9a31568b595f/html5/thumbnails/31.jpg)
Done!
- Lyft is hiring. If you want to work on large-scale problems in a fast-moving, high-growth company, visit lyft.com/jobs
- Visit github.com/lyft
- Slides available at slideshare.net/danielhochman
- Q&A