This Is Your PostgreSQL on Drugs
DESCRIPTION
PostgreSQL is known to be a powerful open source relational database with many uses. One such use is warehousing EMRs (Electronic Medical Records) from oncology practices across the country. PostgreSQL, Perl, Apache, Ubuntu Linux, and OpenBSD are each used for their strengths to show pharmaceutical companies what their drugs are doing for individuals in real-world scenarios.

Do you have a large amount of data that needs to be searchable, aggregated, and extremely secure at the same time? See many of the creative solutions we have deployed to put PostgreSQL to the task of drugs.

TRANSCRIPT
Aaron Thul
Electronic Medical Office Logistics (EMOL)
http://chasingnuts.com/oscon1.08.pdf
Sorry, no free samples
339
Who am I?
Computer & Database Geek, just like you
Former OSCON Presenter
Presently an IT manager at EMOL
PostgreSQL Evangelist
Penguicon Organizer
With PostgreSQL and OSS, EMOL is
Data collection from EMRs and other sources
Aiding in adherence to standards
Providing physician- and practice-level benchmarking
Data brokering
Enabling automation of national initiatives
Improving patient care
EMOL PostgreSQL Data
Patient Records
Billing Records
Lab Results
Clinical Records
Inventory Management
Patient-Reported Data
Metadata
Physicians' Dictations
Scanned Documents
Images
  X-rays
  MRIs
  CAT Scans
Metadata Storage
ReiserFS with tail packing
Each practice/doctor has a folder
Sun OpenSolaris & ZFS???
Linux and XFS???
NetApp WAFL???
EMOL Software
Ubuntu Linux LTS (8.04)
PostgreSQL (8.3)
Perl (5.8.x)
Windows Unified Data Storage Server 2003
Yes, Windows
EMOL Hardware
HP ProCurve Switches
SonicWall Firewalls & IDS
Large number of SCSI and SATA Hard Drives
iSCSI Servers and DAS
Why PostgreSQL?
Capable
Required Features
Database Team Experience
Security
Community
  Documentation Project
  Mailing Lists
  IRC
  Events Like This!
Why Perl?
Development team experienced with Perl
Unix-centric, and available for Windows
Text parsing and normalizing
I know it, Perl is not sexy like INSERT 'new_popular_language' INTO languages;
Who is Where?
OS and PostgreSQL binaries: local disks, RAID 1 mirror, 15k spindle drives, EXT3
WAL buffers: local disks, RAID 1 mirror, 15k spindle speed, EXT2
Indexes: DAS (Direct Attached Storage) units, RAID 6, 10k spindle speed SCSI, EXT3
Tables: multiple iSCSI servers on SANs, 4 x 1 Gigabit Ethernet interfaces bonded, 8 x 1 terabyte SATA drives per SAN node in RAID 6, EXT3
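Splitting tables and indexes across storage tiers like this is usually done with PostgreSQL tablespaces. The following is a minimal sketch and not EMOL's actual configuration: the mount points, the tablespace names, and the lab_results table are all hypothetical, and each directory must already exist, be empty, and be owned by the PostgreSQL system user.

```sql
-- Hypothetical mount points: one on a DAS unit, one on an iSCSI SAN node.
CREATE TABLESPACE index_space LOCATION '/mnt/das01/pg_indexes';
CREATE TABLESPACE table_space LOCATION '/mnt/san01/pg_tables';

-- Hypothetical table stored on the SAN-backed tablespace.
-- The primary key index is sent to the DAS-backed tablespace explicitly.
CREATE TABLE lab_results (
    lab_result_id bigint PRIMARY KEY USING INDEX TABLESPACE index_space,
    patient_key   bigint NOT NULL,
    collected_on  date   NOT NULL,
    result_value  text
) TABLESPACE table_space;

-- Secondary indexes also go to the DAS-backed tablespace.
CREATE INDEX lab_results_patient_idx
    ON lab_results (patient_key)
    TABLESPACE index_space;
```

The WAL cannot be moved with a tablespace. On PostgreSQL 8.3 it is typically relocated by stopping the server and replacing the pg_xlog directory with a symbolic link to the dedicated disks.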
Data Daily
Loading 10 GB of data daily into PostgreSQL
Loading 10 GB of metadata daily
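Data Size
-- relpages counts 8 kB pages (the default block size) and is only an
-- estimate refreshed by VACUUM and ANALYZE.
SELECT relname, (relpages*8)/1024 AS MB
FROM pg_class
ORDER BY relpages DESC;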
This does not account for pg_toast
This does provide more precision
Data Size Really
-- Per-relation on-disk size, largest first. Passing C.oid instead of a
-- concatenated name avoids quoting problems with mixed-case relation names.
SELECT nspname || '.' || relname AS "relation",
       pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
  AND nspname !~ '^pg_toast'
  AND pg_relation_size(C.oid) > 0
ORDER BY pg_relation_size(C.oid) DESC;
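pg_relation_size reports only the relation itself, and the query above filters out the pg_toast schemas, so wide tables can look smaller than they are on disk. As a hedged sketch of one way to see the full footprint, the built-in pg_total_relation_size adds a table's indexes and TOAST data, and pg_database_size gives the whole database. The LIMIT of 20 is an arbitrary choice for illustration.

```sql
-- Total footprint per table: heap + indexes + TOAST, largest first.
SELECT N.nspname || '.' || C.relname AS "relation",
       pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
FROM pg_class C
JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE C.relkind = 'r'                      -- ordinary tables only
  AND N.nspname NOT IN ('pg_catalog', 'information_schema')
  AND N.nspname !~ '^pg_toast'
ORDER BY pg_total_relation_size(C.oid) DESC
LIMIT 20;

-- Size of the whole current database on disk.
SELECT pg_size_pretty(pg_database_size(current_database())) AS "database_size";
```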
How much data are we talking?
Largest table: 1,844.73 GB
Second largest table: 1,289.36 GB
Largest index: 411.91 GB
Second largest index: 405.08 GB
Total DB size on disk: 16,800.39 GB
Better make sure we need that INDEX
select
    indexrelid::regclass as index, relid::regclass as table
from
    pg_stat_user_indexes
    JOIN pg_index USING (indexrelid)
where
    idx_scan = 0 and indisunique is false;
More details at:
http://people.planetpostgresql.org/xzilla/index.php?/archives/351-Index-pruning-techniques.html
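With indexes in the hundreds of gigabytes, it helps to see how much space each never-scanned index occupies before deciding what to drop. The variation below is a sketch that adds a size column to the query above; it assumes the statistics collector has been running long enough for idx_scan to be meaningful, since the counters start from zero whenever statistics are reset.

```sql
-- Never-scanned, non-unique indexes, largest first.
SELECT s.indexrelid::regclass AS "index",
       s.relid::regclass      AS "table",
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS "index_size"
FROM pg_stat_user_indexes s
JOIN pg_index i ON (i.indexrelid = s.indexrelid)
WHERE s.idx_scan = 0
  AND i.indisunique IS FALSE
ORDER BY pg_relation_size(s.indexrelid) DESC;
```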
Run it twice and make it faster
Maintain a 1/500 set of random sample data
ALL queries hit that database first
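The talk does not say how the 1/500 sample is built, so the following is only a sketch of one simple approach using random(). The sample_data schema and the lab_results table are hypothetical names, and copying the result into a separate sample database (for example with pg_dump) is left out.

```sql
-- Hypothetical schema to hold the sampled copies.
CREATE SCHEMA sample_data;

-- Keep each row with probability 1/500.
CREATE TABLE sample_data.lab_results AS
SELECT *
FROM lab_results
WHERE random() < 1.0 / 500;

-- Give the planner statistics for the new table.
ANALYZE sample_data.lab_results;
```

Sampling every table independently breaks joins between them. If queries need related rows to stay together, filtering on a shared key instead (for example, a hypothetical patient_key % 500 = 0) keeps each sampled patient's rows in every table, at the cost of a less random sample.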
How do I sleep at night?
First names
Last names
Social Security numbers
Birth dates
Needed to track people over time and geography
"By default, PostgreSQL is probably the most security-aware database available..."
Database Hacker's Handbook
Protecting the Warehouse
Simple processes that are followed
Intrusion Prevention & Firewalls
Security Monitoring & Management - MSSP
Encrypted Communication
Centralized management of users and groups
  mitigates vulnerabilities that occur due to inconsistencies
Role-based security
SECURITY DEFINER functions where we can
Identity data symmetrically encrypted
Data is anonymized in all but a few tables
Role-based security and schemas
All data is anonymized before it is sent out
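Several of these measures can be combined inside the database. The sketch below is an illustration under stated assumptions, not EMOL's schema: the patient_identity table, the analyst role, the patient_exists function, and the passphrase are all hypothetical, and it assumes the pgcrypto contrib module is installed (its pgp_sym_encrypt and pgp_sym_decrypt functions provide symmetric encryption).

```sql
-- Hypothetical table: identity fields are stored only as encrypted bytea.
CREATE TABLE patient_identity (
    patient_key bigint PRIMARY KEY,  -- anonymous key used by all other tables
    last_name   bytea NOT NULL,
    ssn         bytea NOT NULL
);

-- Hypothetical role with no direct access to identity data.
CREATE ROLE analyst NOLOGIN;
REVOKE ALL ON patient_identity FROM PUBLIC;

-- Encrypt on the way in; in practice the passphrase comes from the caller,
-- never from a literal stored in the database.
INSERT INTO patient_identity (patient_key, last_name, ssn)
VALUES (1,
        pgp_sym_encrypt('Doe', 'example-passphrase'),
        pgp_sym_encrypt('000-00-0000', 'example-passphrase'));

-- SECURITY DEFINER: the function runs with its owner's privileges, so
-- analysts can ask a narrow question without reading the table itself.
CREATE FUNCTION patient_exists(bigint) RETURNS boolean AS $$
    SELECT EXISTS (SELECT 1 FROM patient_identity WHERE patient_key = $1);
$$ LANGUAGE sql STABLE SECURITY DEFINER;

REVOKE EXECUTE ON FUNCTION patient_exists(bigint) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION patient_exists(bigint) TO analyst;

-- Decryption requires both table access and the passphrase.
SELECT pgp_sym_decrypt(ssn, 'example-passphrase') AS ssn
FROM patient_identity
WHERE patient_key = 1;
```

When writing SECURITY DEFINER functions, it is generally recommended to pin search_path for the function so that a caller cannot substitute objects of their own.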
PostgreSQL scaling
Size matters: Yahoo claims 2-petabyte database is world's biggest, busiest
Based on a modified PostgreSQL engine, the year-old database processes 24 billion events a day, according to Waqar Hasan, vice president of engineering in Yahoo's data group.
GridSQL from EnterpriseDB
Built using multiple standard PostgreSQL servers
Open Source Project
Lessons Learned
Server Ethernet cards are not all made the same
With 100+ drives, be ready to RMA some disks
You can't have too big a cache on your RAID controller
More Lessons Learned
pg_resetxlog is not THAT scary
Don't ever use this!!!
You can never have too many PCI-X slots
Auto-vacuum is not always your friend
Worry when a developer says "I have an idea"
Some mistakes are just too much fun to make only once
I am used to hearing "It seems like you are doing something fundamentally wrong"
Never ask for directions from a two-headed tourist!
- Big Bird
Questions
Web: http://www.chasingnuts.com
Email: [email protected]
IRC: AaronThul on irc.freenode.org
Jabber: [email protected]
@AaronThul
AIM: AaronThul