Magento Scalability from the Trenches (Meet Magento Sweden 2016)

Post on 08-Jan-2017


MAGENTO SCALABILITY

from the trenches

Piotr Karwatka

AGENDA

1. General scalability rules
2. Action plan – scalability framework
3. Magento B2B case
   1. EAV and indexes
   2. Cache
   3. Replication
   4. Fine-tuning
4. Magento 2.0

THE CHALLENGE

- Good architecture is a rare good
- There is no holy grail of scalability
- Always take a custom approach: measure before optimizing
- Start “cheap”, scale fast – risky
- Process-driven work over improvisation
- Redundancy: scalability goes with availability
- Divide and conquer, using layers
- Measure and examine bottlenecks
- Scale only the overloaded layers
- Good news: Magento is scalable by design

[Diagram: application layers – middleware, cache, storage, app, db]

HARDWARE APPROACH

At start: optimize code and use cache (New Relic and collectd help to catch bottlenecks); try HHVM, nginx, OPcache.

Vertical: more RAM, more CPUs
+ no code changes required, fast gain
- technology barriers
- very expensive at some point

Horizontal: more cheap servers
+ high availability when done right
+ cloud ready
- often requires code refactoring
- challenging configuration and DevOps

[Chart: cost at scale]

ACTION PLAN

Step 1
- use vertical scaling as far as it is reasonable
- optimize code to avoid bottlenecks
- use caching wherever possible
- separate the database server
- separate static files and/or use a CDN

Step 2
- add more app servers
- establish a cache cluster
- use a reverse proxy (Varnish)

Step 3
- use database replication
- scale out horizontally

First go vertical, then go horizontal.

MAGENTO CASE – THE CHALLENGE

TIM.PL – the largest B2B site in Poland. About 100,000,000 EUR / year.
Platform for customers: offers/inquiries, bulk orders, near real-time CRM/WMS integration.

- B2B e-commerce site with external integrations (CRM, PIM, ERP, WMS)
- Up to 1.5M SKUs
- Up to 2K active concurrent users, average session time: 4h+
- About 6,000 attributes
- About 2,189 attribute sets
- 1M+ website calls / day
- Challenging read/write ratio: 50/50
- B2B features, site used as a tool/platform; browse/checkout scenario

We called it an MVP. It worked well up to a point...

FIRST APPROACH – 3 years ago

- Cache for blocks enabled
- FLAT enabled, but at 5,000+ attributes we hit InnoDB limits
- The code was optimized quite well (we used Ivan’s tips: http://www.slideshare.net/ivanchepurnyi/making-magento-flying-like-a-rocket-a-set-of-valuable-tips-for-developers)
- Separate DB server + master-master replication (for backup purposes)
- SSD disks (APP + DB), lots of RAM (16GB / server) – vertical scaling approach
- MySQL tuning (IO buffers, InnoDB buffers)
- Apache tuning (connection limits, FPM)
- HHVM tested: about +50% boost, but no profiling

OPTIMIZE AND PROFILE!

Always measure the impact of a change before deploying it to production:
- JMeter – we used it to emulate throughput and run load tests after each change
- New Relic – to analyze application speed and track slow queries and method calls; it can be used on production servers as well because of its near-zero overhead
- Collectd – installed on both app and db servers; with it we discovered IO bottlenecks and db-locking during Magento’s product indexation
- Logs – we used ELK (Kibana) and a custom New Relic integration to diagnose web-service response times
- htop, iotop – during IO problems they help to find exactly what generates the load
- Xdebug/XHProf profiler – on stage servers, to debug and profile code and discover cache gaps
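The "measure before deploying" rule above can be reduced to a simple acceptance check on load-test results. The following is a minimal sketch, not the tooling used in the talk: it assumes response times have already been exported from a load-test run (for example from JMeter) as plain lists of milliseconds, and the function names, the 5% threshold, and the sample numbers are all hypothetical.

```python
import statistics


def summarize(latencies_ms):
    """Return median and 95th-percentile latency for one load-test run."""
    ordered = sorted(latencies_ms)
    # n=20 yields 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(ordered, n=20, method="inclusive")[-1]
    return {"median": statistics.median(ordered), "p95": p95}


def is_improvement(before_ms, after_ms, min_gain=0.05):
    """Accept a change only if p95 latency drops by at least min_gain (a fraction)."""
    before = summarize(before_ms)
    after = summarize(after_ms)
    return after["p95"] <= before["p95"] * (1 - min_gain)


# Hypothetical samples, not measurements from the talk.
baseline = [120, 130, 125, 400, 135, 128, 122, 390, 131, 127]
candidate = [100, 105, 98, 210, 110, 102, 99, 205, 104, 101]
print(is_improvement(baseline, candidate))
```

Comparing a tail percentile rather than the average is a design choice here: bottlenecks such as IO stalls or db-locking tend to show up in the slowest requests first.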

[Diagram: tuning loop – JMeter 2h load tests, fine tuning, JMeter 24h load tests; optimize one piece at a time]

High availability is crucial – we switched to 2N model

[Diagram: HAProxy + Varnish (load balancing, plus reverse proxy for caching and static files) in front of two master-master app servers with GlusterFS; both servers can handle user requests]

APP & CACHE

- Redis is faster than memcached as a cache backend
- Varnish (with ESI) is a must for both static files and page caching (we used Turpentine and Phoenix on some projects, both are fine); VCL can be challenging
- We managed to use HAProxy as a load balancer (with automatic failover)
- We added cache to Mage_Catalog_Model_Product::load
- Consider adding cache to Mage_Eav_Model_Entity_Abstract to avoid EAV altogether; we couldn’t use FLAT because of the attribute count
- We turned on FLAT for the 900 most frequently used attributes (InnoDB limits)
- Sessions were moved to Redis
- We discovered a lot of queries to core_url_rewrite; cache should help here
- We used the Fast-Async Reindexing module on Magento 1.x to avoid database locking
- GlusterFS is used to handle uploads and replication

VARNISH IMPACT


APP & CACHE

Remarks:
- GlusterFS / network file systems: stat() and open() calls without local caching exhaust IO
- We had some issues with APC on PHP 5.4 (segfaults); now everybody uses OPcache ☺
- At some point we switched from Apache to nginx + php-fpm to gain req/s throughput and lower memory usage (read more here: http://info.magento.com/rs/magentocommerce/images/MagentoECG-PoweringMagentowithNgnixandPHP-FPM.pdf)
- We had problems with the Magento API (really slow responses: 0.5s); after optimizations: 0.2s; with HHVM added: 0.1s; next step: a fast-responding façade without the Magento overhead - http://divante.co/blog/magento-1-9-1-0-page-load-time-0-3s/
- We had problems with Redis clogging with cache keys (http://divante.co/blog/magento-clogged-redis-cache/)

HHVM IMPACT


THE HARD WAY

- Most challenging issues: EAV and indexing
- It would be great to use a NoSQL DB (MongoDB, SOLR)
- At this point we use only model-level cache
- We disabled Magento logs and reports: fewer queries, less useless data to store
- Small configuration tips make a big difference:
  - query_cache_size – up to 128MB works well; beyond that, cache cleaning can be really, REALLY slow
  - innodb_thread_concurrency – setting it to 0 prevents MySQL from clogging worker threads (it looks like locking, but it isn’t)
- We switched from MySQL to PerconaDB/XtraDB:
  - Great performance gain on peaks (query count vs. response time): up to +275%
  - No code / SQL changes required, 100% compatible with MySQL
- MemSQL looks really promising, not tested yet

DATABASE CAVEATS

Without FLAT in place there are a lot of EAV-related queries, and also a lot of URL-redirect-related queries. Those queries are unnecessary.

HOW TO DISABLE EAV?

– It would be great if we could switch to a NoSQL DB (like MongoDB, SOLR, Sphinx Search)
– One can overwrite the EAV->FLAT indexers, but it’s extremely hard (relations, some modules work on raw SQL)
– Suggestions:
  - Add cache to the Product::load method; invalidation is extremely important (you can use the modification date in the cache key, or an observer-based mechanism to clear it)
  - Add cache for loading EAV attributes, for products and product categories
  - Overwrite/refactor Mage_Catalog for searching and browsing products; some search modules do this partially
– Great knowledge base about EAV: http://www.solvingmagento.com/magento-eav-system/
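The "modification date in the cache key" suggestion can be illustrated with a small read-through cache. This is a language-neutral sketch in Python, not Magento code: `ProductCache`, `load_product` and `load_updated_at` are hypothetical names, and the in-memory dict stands in for Redis or another backend. It assumes the modification date can be fetched much more cheaply than the full EAV load.

```python
class ProductCache:
    """Read-through cache keyed by product id plus last-modified timestamp."""

    def __init__(self, load_product, load_updated_at):
        self._load_product = load_product        # expensive full (EAV) load
        self._load_updated_at = load_updated_at  # cheap lookup of the modification date
        self._store = {}                         # stand-in for Redis or another backend

    def get(self, product_id):
        # A changed modification date yields a new key, so stale entries are
        # simply never read again; no explicit invalidation call is needed.
        key = f"product:{product_id}:{self._load_updated_at(product_id)}"
        if key not in self._store:
            self._store[key] = self._load_product(product_id)
        return self._store[key]


# Hypothetical data source for the demo.
updated_at = {1: "2016-01-01"}
loads = []


def slow_load(product_id):
    loads.append(product_id)
    return {"id": product_id, "version": updated_at[product_id]}


cache = ProductCache(slow_load, lambda pid: updated_at[pid])
cache.get(1)
cache.get(1)                    # served from cache, no second load
updated_at[1] = "2016-02-01"    # product saved: the key changes
cache.get(1)                    # reloaded under the new key
print(len(loads))
```

One trade-off of this approach: old keys are never deleted by the cache itself, so a real backend needs a TTL or eviction policy, otherwise orphaned keys accumulate.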


This applies if you cannot use FLAT (a must for categories and products), either because it is too slow or because you have too many attributes.

DATABASE SCALABILITY - REPLICATION

With replicas one gets high availability and more req/s. It doesn’t fit all cases:
- Caution: replication lag
- It’s possible to move selected tables to external servers (like product catalogs)
- Always consider using cache first!
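One common way to live with replication lag is to send a session's reads to the master for a short window after that session writes. The sketch below illustrates the idea only; it is not how Magento routes connections, and `ReplicaAwareRouter`, the 2-second window, and the server names are hypothetical.

```python
import time


class ReplicaAwareRouter:
    """Route writes to the master and reads to replicas, except right after a write."""

    def __init__(self, master, replicas, lag_window_s=2.0, clock=time.monotonic):
        self.master = master
        self.replicas = list(replicas)
        self.lag_window_s = lag_window_s  # assumed upper bound on replication lag
        self._clock = clock
        self._last_write = {}             # session id -> time of last write
        self._next = 0                    # round-robin cursor

    def for_write(self, session_id):
        self._last_write[session_id] = self._clock()
        return self.master

    def for_read(self, session_id):
        last = self._last_write.get(session_id)
        if last is not None and self._clock() - last < self.lag_window_s:
            return self.master            # a replica may not have the write yet
        if not self.replicas:
            return self.master
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica


router = ReplicaAwareRouter("master", ["replica-1", "replica-2"])
router.for_write("session-a")
print(router.for_read("session-a"))  # master, because session-a just wrote
print(router.for_read("session-b"))  # a replica
```

With a read/write ratio close to 50/50, as in this case, many reads would still land on the master under such a scheme, which is one reason the slide recommends considering cache first.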

[Diagram: replication topologies – master-slave, master-master, and separate masters per table group (TB: users, TB: photos)]

INDEXATION VS. REPLICATION

- Master-slave replication should help with the db-locking issue
- MySQL replicates only data-changing operations (UPDATE/INSERT/DELETE) using binlogs
- This is extremely fast and doesn’t lock replicas

public function processEntityAction(Varien_Object $entity, $entityType, $eventType)
{
    ...
    $resourceModel = Mage::getResourceSingleton('index/process');
    $resourceModel->beginTransaction();
    $this->_allowTableChanges = false;
    try {
        $this->indexEvent($event);
        $resourceModel->commit();
    } catch (Exception $e) {
        $resourceModel->rollBack();
        if ($allowTableChanges) {
            $this->_allowTableChanges = true;
            $this->_changeKeyStatus(true);
            $this->_currentEvent = null;
        }
        throw $e;
    }
    ...
}

DATABASE – NEXT STEPS

- We tested app-local master-slave replication to avoid network latency and database locking:
  – Magento supports this kind of replication out of the box
  – Next step: move the catalog database to a separate server
  – Route Admin panel requests to separate servers (using the Magento 2 multi-master feature)

[Diagram: HAProxy & Varnish (load balancer + proxy) in front of master-master app servers + GlusterFS + PerconaDB with local db slaves for read access; each server can handle user requests; separate label: indexing, updates, imports, RDBM]

INTEGRATIONS

- We use queuing to avoid bottlenecks
- On each app server there are Gearman workers (PHP processes) responsible for getting prices and stocks and transferring orders
- Workers exchange data with CRM, WMS, ERP, PIM in both async and sync modes, using priorities
- We used the Command/Task design pattern
- We log everything using ELK (especially Kibana) and New Relic to analyze external systems
- Magento API can be very challenging (it’s extremely slow)
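The combination of the Command/Task pattern with prioritized queuing can be sketched as follows. This is an in-process illustration only: the talk used Gearman workers written in PHP, whereas the sketch uses Python's `heapq` as a stand-in queue, and the task names, priorities, and payloads are hypothetical.

```python
import heapq
import itertools


class Task:
    """Command object: bundles an action with its arguments for later execution."""

    def __init__(self, name, action, *args):
        self.name = name
        self._action = action
        self._args = args

    def execute(self):
        return self._action(*self._args)


class PriorityTaskQueue:
    """Runs tasks in priority order; a lower number means higher priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order within a priority

    def submit(self, task, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), task))

    def run_all(self):
        results = []
        while self._heap:
            _, _, task = heapq.heappop(self._heap)
            results.append((task.name, task.execute()))
        return results


queue = PriorityTaskQueue()
queue.submit(Task("sync-stock", lambda sku: f"stock:{sku}", "A1"), priority=5)
queue.submit(Task("send-order", lambda order_id: f"order:{order_id}", 42), priority=1)
print(queue.run_all())  # the order transfer runs before the stock sync
```

Wrapping each integration call in a command object keeps the queue unaware of which external system (CRM, WMS, ERP, PIM) a task talks to, so new task types need no changes to the worker loop.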


MONITORING

We use Kibana (ELK stack) and custom New Relic metrics to monitor real-time integrations (CRM, WMS, ERP).
Zabbix with Selenium scripts is used to monitor website availability and raise alerts.

FINAL ARCHITECTURE

[Diagram: web requests and API calls pass through HAProxy & Varnish (load balancer + proxy) to master-master app servers + GlusterFS + PerconaDB with local db slaves for read access; each server can handle user requests; Gearman queue workers handle background jobs and external integrations (external system calls)]

WHAT I’VE MISSED + MAGENTO 2

- Search: we used FactFinder / SOLR
- Details about Varnish and HHVM
- Life is going to be easier. What excites me in Magento 2?
  – Materialized views engine: smarter indexation
  – Full page caching in Community
  – Multi-master DB contexts
  – Checkout optimizations

THANK YOU! QUESTIONS?


Technical or scalability challenges? Contact me to consult your case for free!

Piotr Karwatka (pkarwatka@divante.pl)
Divante – http://divante.co
