Managing and Monitoring TeamPage
Chris Nuzum, Traction Software. Traction User Group, Oct 15 2010, Newport RI. For TUG 2010 Newport slides, agenda, and more, see www.TractionSoftware.com
Managing TeamPage
2
Topics
• Process structure
• Non-plugin local server modifications
• Settings overview
• Setup interfaces
• Q&A
3
TeamPage — Processes
• 3 Java processes
  • StartTraction wrapper process: java.exe, lower process ID
    • Invokes Traction, monitors and proxies its output, restarts it
  • Traction process: java.exe, higher process ID
  • JavaDB Network Server: javaw.exe
• Plus native Windows Service/Linux Daemon
4
Locking
• Only one server should ever run against a journal at a time
  • Otherwise the journal file can be inconsistently numbered, requiring manual repair
• Two safeguards
  • Lock file: if present, server won’t start
    • Removed by JVM on clean exit
  • Socket: if bound, server won’t start
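The two safeguards can be illustrated with a minimal Python sketch. It is not TeamPage's implementation: the lock file path, the port, and the function names are hypothetical, and it only shows the general idea of refusing to start when either a lock file exists or a well-known socket is already bound.

```python
import os
import socket


def lock_file_present(lock_path):
    """Safeguard 1: a lock file left by a running (or crashed) server."""
    return os.path.exists(lock_path)


def port_is_bound(port, host="127.0.0.1"):
    """Safeguard 2: try to bind the port; failure means another process holds it."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
    except OSError:
        return True
    else:
        return False
    finally:
        probe.close()


def may_start(lock_path, port):
    """Return True only if neither safeguard blocks startup."""
    return not lock_file_present(lock_path) and not port_is_bound(port)
```

The socket check covers the case where a lock file was removed by hand while a server process is still alive, which is why both checks are useful together.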
5
Windows Startup
• Automatic
  • Windows Service, runs as System
    • Unlock Traction on Boot
      • removes lockfile
    • Traction
      • runs StartTraction wrapper
        • runs Traction
      • console output - traction.out.txt
• Manual
  • Windows Application
    • TractionApplication
      • runs StartTraction wrapper
        • runs Traction
      • console output - console
6
Unix Startup
• Automatic
  • /etc/init.d/traction start
    • TractionDaemon
      • runs StartTraction wrapper
        • runs Traction
      • console output - traction.out.txt
• Manual
  • command line
    • TractionApplication
      • runs StartTraction wrapper
        • runs Traction
      • console output - console
7
Shutdown
• Windows
  • Stop service
    • Stops StartTraction
    • Traction listens for heartbeat from StartTraction
    • If no heartbeat for specified period, exits with restart code
    • If no StartTraction, restart = exit
• Unix
  • /etc/init.d/traction stop
  • kill -2 {Traction process ID}
    • invokes clean shutdown handler
• All platforms, preferred
  • Shutdown Traction button in Server Setup
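The heartbeat behavior described for Windows can be sketched as a small watchdog. This is an illustration only, written in Python rather than the server's Java: the class name, the timeout, and the restart exit code value are assumptions, not TeamPage's actual values.

```python
import time

RESTART_EXIT_CODE = 3  # hypothetical value; the real restart code is not given in the slides


class HeartbeatWatchdog:
    """Tracks the last heartbeat received from the wrapper process."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._last_beat = clock()

    def beat(self):
        """Record a heartbeat from the wrapper."""
        self._last_beat = self._clock()

    def expired(self):
        """True when no heartbeat has arrived within the timeout."""
        return self._clock() - self._last_beat > self._timeout


def exit_code_if_wrapper_gone(watchdog):
    """Return the restart exit code once the wrapper has gone silent, else None."""
    return RESTART_EXIT_CODE if watchdog.expired() else None
```

Because the wrapper is the only thing that acts on the restart code, exiting with it after the wrapper has stopped amounts to a plain exit, which matches "restart = exit" above.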
8
Shutdown Process
• Finish current work, then halt thread
  • Appending journal
  • Reading mailbox
  • Serving files
• In error state, work may be stuck
  • Force shutdown via web page
  • Last resort, force shutdown via operating system
    • Manually clear lock file after ensuring processes are gone
9
Local Mods
• Most configuration is managed by TeamPage interfaces and can be overridden or extended in plug-ins.
• Exceptions
  • Adding support for new MIME types: modify mime.types
  • Supporting new user-agents: modify useragent.properties
• Re-apply modifications after upgrade
10
Settings Overview
11
Tour of Admin Interfaces
• Server Setup
• Project Setup
• Personal Setup
• Skin Setup
• Plug-in Options
TeamPage Monitoring & Debugging
13
Normal EKG
14
Flatlining
15
Support2478
Support 2478: I use FAST Search and my TeamPage server has suddenly become very slow or has reported OutOfMemory errors. How do I recover?
16
Walkthrough of Tools
• Console output - traction.out.txt
• Log files
  • logs/debug.log
  • logs/statistics.log
    • graphing memory usage - Support668
• Thread manager
• kill -3: Java thread + monitor dump
• JConsole: install JDK, run with -Dcom.sun.management.jmxremote
Backup & Availability Planning
18
• Considerations
• Recommendations
• Discussion - learn from each other
19
Considerations
• Backup FAQ is Doc19
• Entire installation, not just journal data
  • Considerable time is invested in server settings and config files; don’t want to have to recreate them
• Open files: Windows Volume Shadow Copy
• Time window
  • Files can change during backup, requiring a rebuild on restore to ensure consistency
20
Recommendations
• VMware snapshots address the consistency issue, can be run online, allow rollback, and can be mirrored remotely
  • Recommend capturing machine state as well as disks, otherwise a rebuild is required on restore
• Scheduled rsync
  • Incremental remote mirror
• ZFS snapshots, export
• AWS EC2 snapshots
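As one illustration of the scheduled rsync option, the following Python sketch builds an incremental mirror command for the entire installation directory. The install path and remote target are hypothetical placeholders; `-a` (archive mode) and `--delete` (remove files from the mirror that no longer exist at the source) are standard rsync options. Scheduling is left to cron or a similar tool.

```python
import subprocess


def rsync_mirror_command(install_dir, remote_target):
    """Build an rsync command that mirrors the whole installation, not just the journal."""
    # A trailing slash makes rsync copy the directory's contents into the target.
    source = install_dir.rstrip("/") + "/"
    return ["rsync", "-a", "--delete", source, remote_target]


def run_mirror(install_dir, remote_target):
    """Run the mirror; raises CalledProcessError if rsync reports a failure."""
    return subprocess.run(rsync_mirror_command(install_dir, remote_target), check=True)
```

Note that a plain rsync of a running server is still subject to the time-window issue: files can change while the copy is in progress, so a rebuild may be needed on restore.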
Traction Authentication & Authorization
22
Security Principals
• Identify users & groups
• traction:u:18, traction:g:24, ad:g:52
• Groups defined by principal, recursively
• traction:g:24 = { ad:g:52, traction:u:42 }
• ACLs defined over principals
• traction:g:18 allow publish own
• Each user has exactly one security principal
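The recursive group definition can be sketched as a membership check over a table of principals. This is a minimal illustration, not TeamPage's data model: the contents of ad:g:52, the ACL table, and the function names are hypothetical, and only the traction:g:24 definition comes from the slide.

```python
GROUPS = {
    "traction:g:24": {"ad:g:52", "traction:u:42"},  # from the slide
    "ad:g:52": {"ad:u:7"},                          # hypothetical members
}

ACLS = {
    "traction:g:24": {"publish own"},               # hypothetical ACL entry
}


def is_member(principal, group, groups, _seen=None):
    """True if principal belongs to group directly or through nested groups."""
    seen = set() if _seen is None else _seen
    if group in seen:  # guard against cyclic group definitions
        return False
    seen.add(group)
    for member in groups.get(group, ()):
        if member == principal:
            return True
        if member in groups and is_member(principal, member, groups, seen):
            return True
    return False


def is_allowed(user, permission, acls, groups):
    """True if the user's principal, or any group containing it, is granted the permission."""
    for principal, permissions in acls.items():
        if permission not in permissions:
            continue
        if principal == user or is_member(user, principal, groups):
            return True
    return False
```

Here an external user in ad:g:52 picks up permissions granted to traction:g:24 because the AD group is itself a member of the local group.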
23
Local Users & Groups
• Stored in journal
• Cached in memory
24
External Users & Groups
• Defined, managed externally
• Active Directory, LDAP, and others supported
• Cached in Principal Cache
  • Downloaded at startup, updated asynchronously at defined interval
  • Force update by clearing cache with /type cachemanager
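The cache lifecycle (load at startup, refresh at an interval, force a reload by clearing) can be sketched as follows. This is a simplified, synchronous stand-in: the real Principal Cache updates asynchronously, and the class name, loader callback, and interval are all assumptions made for illustration.

```python
import time


class PrincipalCache:
    """Holds external principals and reloads them when stale or cleared."""

    def __init__(self, load_directory, refresh_seconds, clock=time.monotonic):
        self._load = load_directory        # callable returning {principal_id: data}
        self._refresh = refresh_seconds
        self._clock = clock
        self._principals = None
        self._loaded_at = None
        self.reload()                      # downloaded at startup

    def reload(self):
        """Fetch a fresh copy of the principals from the directory."""
        self._principals = dict(self._load())
        self._loaded_at = self._clock()

    def clear(self):
        """Drop cached principals so the next lookup forces a reload."""
        self._principals = None

    def lookup(self, principal_id):
        """Return cached data for a principal, reloading first if cleared or stale."""
        stale = (
            self._principals is None
            or self._clock() - self._loaded_at >= self._refresh
        )
        if stale:
            self.reload()
        return self._principals.get(principal_id)
```

In this sketch a lookup that hits a stale cache blocks while the directory is queried; an asynchronous refresh avoids that cost, which is presumably why the real cache updates in the background.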
25
Customizable Queries
• User directory configuration defines how lookups are done
• Depending on directory server, changing queries can dramatically improve performance
26
Extensible Architecture
• Login Manager
  • Determines credentials based on request
• Authenticator
  • Determines whether credentials are valid
27
Hybrid Login Managers
• Handle different types of connections differently
• Dispatch based on skin, user-agent, URI path
  • Realms: HTTP basic auth, login always required
  • OpenRealms: HTTP basic auth, login optional
hybrid_realms=com.traction.admin.RealmsLoginManager
hybrid_realms_useragents=securerobot,attachmentsrobot
hybrid_realms_servlet_paths=/webdav,/db
hybrid_open_realms=com.traction.admin.OpenRealmsLoginManager
hybrid_open_realms_skins=rss,rss091,rss092,rss10,rss20,atom,ical
hybrid_open_realms_useragents=rss,robot,calendar
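The dispatch these settings describe can be sketched in Python. The property names and values are taken from the slide; the matching rules (exact user-agent class names, path-prefix matching, Realms checked before OpenRealms) and the fallback value are assumptions made for illustration, not documented TeamPage behavior.

```python
CONFIG_TEXT = """\
hybrid_realms=com.traction.admin.RealmsLoginManager
hybrid_realms_useragents=securerobot,attachmentsrobot
hybrid_realms_servlet_paths=/webdav,/db
hybrid_open_realms=com.traction.admin.OpenRealmsLoginManager
hybrid_open_realms_skins=rss,rss091,rss092,rss10,rss20,atom,ical
hybrid_open_realms_useragents=rss,robot,calendar
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def listed(props, key):
    """Split a comma-separated property into a list of non-empty items."""
    return [item for item in props.get(key, "").split(",") if item]


def under(path, prefix):
    """True if path is the prefix itself or lies below it."""
    return path == prefix or path.startswith(prefix + "/")


def choose_login_manager(props, skin, useragent, path, default="default"):
    """Pick a login manager class name from skin, user-agent class, and URI path."""
    if useragent in listed(props, "hybrid_realms_useragents") or any(
        under(path, prefix) for prefix in listed(props, "hybrid_realms_servlet_paths")
    ):
        return props["hybrid_realms"]
    if skin in listed(props, "hybrid_open_realms_skins") or useragent in listed(
        props, "hybrid_open_realms_useragents"
    ):
        return props["hybrid_open_realms"]
    return default  # hypothetical fallback, e.g. a cookie-based login manager
```

With this configuration, a WebDAV request always requires login, while a feed reader requesting the rss20 skin may authenticate optionally.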
28
Hybrid Authenticator
• Switches based on principal
  • Handles both local Traction users and AD/LDAP users
29
Simple Login Managers
• Cookies: encrypted cookie sent to browser
  • Can be encoded with IP address of client
• Realms: HTTP Basic
  • Most secure over HTTPS
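Binding a cookie to the client's IP address can be illustrated with a short Python sketch. Note the substitution: the slide describes an encrypted cookie, whereas this sketch uses an HMAC signature as a simpler stand-in for the same idea, so a cookie presented from a different address is rejected. The cookie layout, function names, and secret are hypothetical.

```python
import hashlib
import hmac


def make_cookie(user, client_ip, secret):
    """Issue a cookie whose signature covers both the user and the client IP."""
    payload = f"{user}|{client_ip}".encode()
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return f"{user}|{signature}"


def cookie_user(cookie, client_ip, secret):
    """Return the user if the cookie is valid for this client IP, else None."""
    user, _, signature = cookie.rpartition("|")
    payload = f"{user}|{client_ip}".encode()
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many characters matched
    if hmac.compare_digest(signature.encode(), expected.encode()):
        return user
    return None
```

The trade-off of IP binding is that users whose address changes mid-session (mobile networks, some proxies) must log in again.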
30
Single Sign-on
• LDAP X.509 Client Certificates
  • HTTPS provides cert, determines principal
  • Look up user in LDAP, make sure cert matches
• NTLM
  • After handshake, browser provides hash code
  • Validate hash with AD server
31
Single Sign-on
• NTLMv2
  • Via commercial library
  • Emulates protocol Windows workstations use to allow users to log in
  • More secure, more robust
32
Federation
• NTLM RunAs authenticator for use with existing Enterprise Search federators, e.g. Vivisimo
  • Authenticate service account via NTLM, run as user performing the search
Performance Tuning
34
Memory
• Garbage collection burns CPU
• Flushing caches requires reloading caches from disk
• Run 64-bit
  • 32-bit limited to ~1.5GB heap
• For best performance, heap should be less than physical RAM
35
Caching in Traction
• Cache
• config/**, plugins/**/config/**
• Entry tokens
• Permissions
• Principals
• Users
• Group membership
36
Finding what’s slow on a page
• Enable timing debug
• View page source
37
Tuning Label Driven Sections
• Use label-based queries when searching/filtering for labels
• :todo i(:todo and :r42)
38
Use a Smaller Default Timeslice
39
Turn off Project Counts
• Permission filtered, can be expensive to calculate
40
Hide WebDAV Sidebar
• Page doesn’t complete drawing until WebDAV request completes
41
Disable WebDAV Auto Refresh
42
Offload JavaDB
• Run in a different process
• Run on a different computer
• Customer2050
43
Offload Metrics Reports
• Export entire JavaDB
• Run in a different instance
Making the Most of Metrics
45
Making the Most of Metrics
• Tour of metrics
• Hit counter
• Top articles
• Viewed by
• Browsing History
• Controlling who can access detailed metrics reports
• Report controls
• Report details
• Exporting CSV
• Rebuilding indexes
• Q&A