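MyISAM data loading performance problem.
I have run into a MyISAM performance problem and would like to ask for advice.
The details are below. Any pointers would be appreciated. Thanks.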
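Question:
How can I speed up selecting data into a temporary table? The customer would ideally like it to finish within 5 seconds.
As long as it is fast, it does not have to be MyISAM, so other suggestions are welcome.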
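Data size:
A single table with 70-odd columns and 60 million rows. It uses nearly 60 GB of storage, about 1 KB per row.
It is stored as MyISAM on MySQL 5.5, with an auto-increment ID as the primary key, and is partitioned every 3 million rows.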
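The problem:
Lucene returns the IDs of 10,000 matching records, and these IDs are widely scattered. The rows are then written to a temporary table with a statement of this form:
create table temptable select * from datatable where id in(1,2,3,4,5.....)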
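Since the table is partitioned every 3 million rows, one thing that might be worth measuring is sorting the IDs and splitting the IN list per partition, so each statement touches a single partition in ascending key order. The sketch below only builds the SQL text. The partition formula `(id - 1) // 3_000_000` and the helper name `build_partition_queries` are assumptions for illustration, not taken from the real schema, and whether this helps at all would need testing.

```python
from collections import defaultdict

ROWS_PER_PARTITION = 3_000_000  # from the post: one partition per 3 million rows


def build_partition_queries(ids, table="datatable", target="temptable"):
    """Group IDs by (assumed) partition and emit one statement per group.

    Assumption: IDs start at 1 and partition p holds IDs
    p * 3_000_000 + 1 .. (p + 1) * 3_000_000. Adjust to the real ranges.
    """
    groups = defaultdict(list)
    # int() keeps anything non-numeric out of the generated SQL.
    for row_id in sorted({int(i) for i in ids}):
        groups[(row_id - 1) // ROWS_PER_PARTITION].append(row_id)

    statements = []
    for n, part in enumerate(sorted(groups)):
        id_list = ",".join(str(i) for i in groups[part])
        select = f"SELECT * FROM {table} WHERE id IN ({id_list})"
        # The first statement creates the target table, later ones append to it.
        prefix = f"CREATE TABLE {target}" if n == 0 else f"INSERT INTO {target}"
        statements.append(f"{prefix} {select}")
    return statements


if __name__ == "__main__":
    for stmt in build_partition_queries([6_000_001, 5, 3_000_000, 42, 3_000_001]):
        print(stmt)
```

Each returned string can be sent through whatever MySQL client is already in use. The ID list from Lucene is deduplicated and sorted before grouping.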
Each run takes close to 30 seconds.
Here is the profile output.
+----------------------+-----------+
| Status | Duration |
+----------------------+-----------+
| starting | 0.002119 |
| checking permissions | 0.000013 |
| Opening tables | 0.000512 |
| System lock | 0.000030 |
| Table lock | 0.000020 |
| init | 0.002095 |
| creating table | 0.023468 |
| After create | 0.000792 |
| System lock | 0.000006 |
| Table lock | 0.000025 |
| optimizing | 0.000371 |
| statistics | 5.897464 |
| preparing | 0.000147 |
| executing | 0.000005 |
| Sending data | 24.350588 |
| end | 0.000009 |
| query end | 0.000006 |
| freeing items | 0.000463 |
| logging slow query | 0.000008 |
| logging slow query | 0.000005 |
| cleaning up | 0.000548 |
+----------------------+-----------+
21 rows in set (0.00 sec)
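To make it easier to see where the time goes, here is a small sketch that totals the stages from the profile above and ranks them by share. The durations are copied from the table; the function name `stage_shares` is just an illustrative choice.

```python
# Durations in seconds, copied from the profile table above.
PROFILE = [
    ("starting", 0.002119), ("checking permissions", 0.000013),
    ("Opening tables", 0.000512), ("System lock", 0.000030),
    ("Table lock", 0.000020), ("init", 0.002095),
    ("creating table", 0.023468), ("After create", 0.000792),
    ("System lock", 0.000006), ("Table lock", 0.000025),
    ("optimizing", 0.000371), ("statistics", 5.897464),
    ("preparing", 0.000147), ("executing", 0.000005),
    ("Sending data", 24.350588), ("end", 0.000009),
    ("query end", 0.000006), ("freeing items", 0.000463),
    ("logging slow query", 0.000008), ("logging slow query", 0.000005),
    ("cleaning up", 0.000548),
]


def stage_shares(profile):
    """Return (total_seconds, [(status, seconds, share), ...]) sorted by time."""
    merged = {}
    for status, seconds in profile:
        # Some states appear twice in the profile; add them together.
        merged[status] = merged.get(status, 0.0) + seconds
    total = sum(merged.values())
    ranked = sorted(
        ((status, seconds, seconds / total) for status, seconds in merged.items()),
        key=lambda row: row[1],
        reverse=True,
    )
    return total, ranked


if __name__ == "__main__":
    total, ranked = stage_shares(PROFILE)
    print(f"total: {total:.3f} s")
    for status, seconds, share in ranked[:3]:
        print(f"{status:<20} {seconds:>10.6f} s  {share:6.2%}")
```

Nearly all of the time falls in "Sending data" and "statistics"; the other stages together add up to only a few hundredths of a second.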
I/O and memory of the machine.
Disk I/O:
Timing cached reads: 1930 MB in 2.00 seconds = 965.19 MB/sec
Timing buffered disk reads: 712 MB in 3.01 seconds = 236.93 MB/sec
Memory: 12 GB
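As a rough sanity check on these numbers: 10,000 rows of about 1 KB each is only around 10 MB, which the disk could read sequentially in a small fraction of a second. The sketch below works that out and compares it with the "Sending data" time from the profile. It assumes the "MB" in the disk figures means 1024 × 1024 bytes and that a row is exactly 1024 bytes, both simplifications.

```python
ROWS = 10_000                         # IDs returned by Lucene per query
ROW_BYTES = 1024                      # "about 1 KB per row" (assumed exact here)
SEQ_READ_MB_PER_S = 236.93            # buffered disk reads, from the figures above
OBSERVED_SENDING_DATA_S = 24.350588   # "Sending data" stage in the profile


def sequential_read_seconds(rows, row_bytes, mb_per_s):
    """Time to read rows * row_bytes at a sustained sequential rate."""
    megabytes = rows * row_bytes / (1024 * 1024)
    return megabytes / mb_per_s


if __name__ == "__main__":
    seq = sequential_read_seconds(ROWS, ROW_BYTES, SEQ_READ_MB_PER_S)
    print(f"payload: {ROWS * ROW_BYTES / (1024 * 1024):.2f} MB")
    print(f"sequential read estimate: {seq:.3f} s")
    print(f"observed / sequential: {OBSERVED_SENDING_DATA_S / seq:.0f}x")
```

The gap of several hundred times suggests that raw throughput is not the limit here and that the cost is more likely in scattered access to the rows.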
MyISAM configuration.
# Example MySQL config file for medium systems.
#
# This is for a system with little memory (32M - 64M) where MySQL plays
# an important part, or systems up to 128M where MySQL is used together with
# other programs (such as a web server)
#
# You can copy this file to
# /etc/my.cnf to set global options,
# mysql-data-dir/my.cnf to set server-specific options (in this
# installation this directory is /data/site/database/mysql/var) or
# ~/.my.cnf to set user-specific options.
#
# In this file, you can use all long options that a program supports.
# If you want to know which options a program supports, run the program
# with the "--help" option.
# The following options will be passed to all MySQL clients
[client]
#password = your_password
port = 3306
socket = /tmp/mysql.sock
# Here follows entries for some specific programs
# The MySQL server
[mysqld]
port = 3306
socket = /tmp/mysql.sock
skip-external-locking
key_buffer_size = 512M
max_allowed_packet = 2M
table_open_cache = 128
sort_buffer_size = 1M
net_buffer_length = 8K
read_buffer_size = 512K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
#skip-innodb
# Don't listen on a TCP/IP port at all. This can be a security enhancement,
# if all processes that need to connect to mysqld run on the same host.
# All interaction with mysqld must be made via Unix sockets or named pipes.
# Note that using this option without enabling named pipes on Windows
# (via the "enable-named-pipe" option) will render mysqld useless!
#
#skip-networking
# Replication Master Server (default)
# binary logging is required for replication
log-bin=mysql-bin
# binary logging format - mixed recommended
binlog_format=mixed
# required unique id between 1 and 2^32 - 1
# defaults to 1 if master-host is not set
# but will not function as a master if omitted
server-id = 1
# Replication Slave (comment out master section to use this)
#
# To configure this host as a replication slave, you can choose between
# two methods :
#
# 1) Use the CHANGE MASTER TO command (fully described in our manual) -
# the syntax is:
#
# CHANGE MASTER TO MASTER_HOST=, MASTER_PORT=,
# MASTER_USER=, MASTER_PASSWORD= ;
#
# where you replace , , by quoted strings and
# by the master's port number (3306 by default).
#
# Example:
#
# CHANGE MASTER TO MASTER_HOST='125.564.12.1', MASTER_PORT=3306,
# MASTER_USER='joe', MASTER_PASSWORD='secret';
#
# OR
#
# 2) Set the variables below. However, in case you choose this method, then
# start replication for the first time (even unsuccessfully, for example
# if you mistyped the password in master-password and the slave fails to
# connect), the slave will create a master.info file, and any later
# change in this file to the variables' values below will be ignored and
# overridden by the content of the master.info file, unless you shutdown
# the slave server, delete master.info and restart the slaver server.
# For that reason, you may want to leave the lines below untouched
# (commented) and instead use CHANGE MASTER TO (see above)
#
# required unique id between 2 and 2^32 - 1
# (and different from the master)
# defaults to 2 if master-host is set
# but will not function as a slave if omitted
#server-id = 2
#
# The replication master for this slave - required
#master-host =
#
# The username the slave will use for authentication when connecting
# to the master - required
#master-user =
#
# The password the slave will authenticate with when connecting to
# the master - required
#master-password =
#
# The port the master is listening on.
# optional - defaults to 3306
#master-port =
#
# binary logging - not required for slaves, but recommended
#log-bin=mysql-bin
# Point the following paths to different dedicated disks
#tmpdir = /tmp/
#log-bin = /path-to-dedicated-directory/hostname
# Uncomment the following if you are using InnoDB tables
#innodb_data_home_dir = /data/site/database/mysql/var/
#innodb_data_file_path = ibdata1:10M:autoextend
#innodb_log_group_home_dir = /data/site/database/mysql/var/
# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
#innodb_buffer_pool_size = 16M
#innodb_additional_mem_pool_size = 2M
# Set .._log_file_size to 25 % of buffer pool size
#innodb_log_file_size = 5M
#innodb_log_buffer_size = 8M
#innodb_flush_log_at_trx_commit = 1
#innodb_lock_wait_timeout = 50
[mysqldump]
quick
max_allowed_packet = 16M
[mysql]
no-auto-rehash
# Remove the next comment character if you are not familiar with SQL
#safe-updates
#*** MyISAM Specific options
# Size of the Key Buffer, used to cache index blocks for MyISAM tables.
# Do not set it larger than 30% of your available memory, as some memory
# is also required by the OS to cache rows. Even if you're not using
# MyISAM tables, you should still set it to 8-64M as it will also be
# used for internal temporary disk tables.
key_buffer_size = 64M
# Size of the buffer used for doing full table scans of MyISAM tables.
# Allocated per thread, if a full scan is needed.
read_buffer_size = 4M
# When reading rows in sorted order after a sort, the rows are read
# through this buffer to avoid disk seeks. You can improve ORDER BY
# performance a lot, if set this to a high value.
# Allocated per thread, when needed.
read_rnd_buffer_size = 32M
# MyISAM uses special tree-like cache to make bulk inserts (that is,
# INSERT ... SELECT, INSERT ... VALUES (...), (...), ..., and LOAD DATA
# INFILE) faster. This variable limits the size of the cache tree in
# bytes per thread. Setting it to 0 will disable this optimisation. Do
# not set it larger than "key_buffer_size" for optimal performance.
# This buffer is allocated when a bulk insert is detected.
bulk_insert_buffer_size = 128M
# This buffer is allocated when MySQL needs to rebuild the index in
# REPAIR, OPTIMIZE, ALTER table statements as well as in LOAD DATA INFILE
# into an empty table. It is allocated per thread so be careful with
# large settings.
myisam_sort_buffer_size = 256M
# The maximum size of the temporary file MySQL is allowed to use while
# recreating the index (during REPAIR, ALTER TABLE or LOAD DATA INFILE.
# If the file-size would be bigger than this, the index will be created
# through the key cache (which is slower).
myisam_max_sort_file_size = 10G
# If a table has more than one index, MyISAM can use more than one
# thread to repair them by sorting in parallel. This makes sense if you
# have multiple CPUs and plenty of memory.
myisam_repair_threads = 1
# Automatically check and repair not properly closed MyISAM tables.
myisam_recover
[myisamchk]
key_buffer_size = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M
[mysqlhotcopy]
interactive-timeout
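One thing that may be worth checking in the file as posted: the "MyISAM Specific options" block comes after the [mysql] header, and in an option file a setting belongs to the most recent group header above it. If the real file looks the same (it may just be how it was pasted), those values would sit in the mysql client group rather than in [mysqld]. The sketch below groups options by section for a short excerpt of the file. The parser `options_by_section` is a simplified illustration and not how MySQL itself reads option files.

```python
def options_by_section(text):
    """Map each [group] header to the options that follow it."""
    sections = {}
    current = None  # options before any header are stored under None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
            continue
        name, _, value = line.partition("=")
        # A later duplicate inside the same group overwrites the earlier one.
        sections.setdefault(current, {})[name.strip()] = value.strip()
    return sections


# Short excerpt of the configuration above, in the same order.
EXCERPT = """
[mysqld]
key_buffer_size = 512M
read_buffer_size = 512K
read_rnd_buffer_size = 512K
[mysql]
no-auto-rehash
key_buffer_size = 64M
read_buffer_size = 4M
read_rnd_buffer_size = 32M
bulk_insert_buffer_size = 128M
[myisamchk]
key_buffer_size = 512M
"""

if __name__ == "__main__":
    for section, options in options_by_section(EXCERPT).items():
        print(section, options)
```

Running `SHOW VARIABLES LIKE 'key_buffer_size'` on the server would confirm which value is actually in effect.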
xlight
Fri, 2010/05/07 - 03:10
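10,000 records = 10,000 disk seeks = 0.002 s × 10,000 = 20 s
And 60 GB of data basically cannot be loaded entirely into memory on an ordinary single PC server.
Maybe several servers could handle it, using something like GridSQL for distributed queries.
Maybe SSDs can save you.
Maybe a minicomputer would solve it too. (Maybe add Oracle on top, plus a disk array.)
But I have not used any of these. This is all just a pile of guesses and I may well be leading you astray, haha.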
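The seek argument can be turned around to see what latency per random read the 5 second target would need. The sketch below derives the per-row cost implied by the profile ("Sending data" over 10,000 rows) and the per-row budget for 5 seconds. It assumes one random read per row, which is a simplification: index reads and cache hits are ignored.

```python
ROWS = 10_000                         # rows fetched per query
OBSERVED_SENDING_DATA_S = 24.350588   # "Sending data" stage in the profile
TARGET_S = 5.0                        # the customer's target


def estimated_seconds(random_reads, seconds_per_read):
    """Total time if every row costs one random read (assumed model)."""
    return random_reads * seconds_per_read


def per_read_budget(total_seconds, random_reads):
    """Latency each read may take to finish within total_seconds."""
    return total_seconds / random_reads


if __name__ == "__main__":
    implied = per_read_budget(OBSERVED_SENDING_DATA_S, ROWS)
    budget = per_read_budget(TARGET_S, ROWS)
    print(f"implied cost per row today: {implied * 1000:.2f} ms")
    print(f"budget per row for {TARGET_S:.0f} s: {budget * 1000:.2f} ms")
    print(f"check: {estimated_seconds(ROWS, budget):.1f} s at that budget")
```

Under this model each row has to come back in about half a millisecond, which is why the suggestions in this thread are about storage with much lower random-access latency or about keeping the data in memory.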
Guest (not verified)
Fri, 2010/05/07 - 09:28
SSDs should work. Things got much faster for us after we switched to SSDs, and an 80 GB SSD is quite cheap these days.
沙加 (not verified)
Wed, 2010/05/19 - 14:22
You could try a distributed key-value database such as Cassandra. If you only look rows up by primary key, it is very fast.