Anyone who works in Linux operations and maintenance will run into problems and failures sooner or later. Summarizing the experience, tracking down the problem, and analyzing the cause of each failure is a good habit for a Linux operations engineer. Every technical breakthrough comes with frustration as well as satisfaction, but if we keep working at it we keep accumulating experience. That is the reward practice gives us.
The following summarizes faults I have run into in my own projects, together with their solutions. I hope some of them are familiar and useful to you.
First: common problems and solutions
1. Shell script does not execute
Problem: One day a colleague in R&D asked me to look at a shell script he had written. It would not run and reported an error. The script was very simple and had no obvious mistakes, yet it failed with ": bad interpreter: No such file or directory". Seeing that message, I asked whether he had written the script on Windows and then uploaded it to the Linux server. Sure enough, he had.
Cause: In DOS/Windows, lines in a text file end with \r\n, while on *nix systems they end with \n. A text file edited in DOS/Windows therefore shows an extra ^M (carriage return) at the end of every line on *nix, and the system looks for an interpreter whose name ends in that carriage return.
Solution: 1) Rewrite the script on Linux; 2) In vi, run :%s/\r//g or :%s/^M//g (type ^M as Ctrl+v, Ctrl+m).
Tip: sh -x script_name runs the script with tracing and echoes each command as it executes, which helps when troubleshooting complex scripts.
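The same check and fix can be done without opening an editor. The following is a minimal sketch, assuming a throwaway script created in a temporary directory (the file name demo.sh is made up for the demo). It counts lines ending in a carriage return, strips them with tr, and runs the cleaned copy.

```shell
#!/bin/sh
# Build a sample script with Windows (CRLF) line endings in a temp directory.
workdir=$(mktemp -d)
script="$workdir/demo.sh"
printf '#!/bin/sh\r\necho hello\r\n' > "$script"

# Count lines that end in a carriage return; non-zero means CRLF endings.
cr=$(printf '\r')
crlf_before=$(grep -c "${cr}\$" "$script")

# Strip every carriage return into a clean copy.
tr -d '\r' < "$script" > "$script.unix"
crlf_after=$(grep -c "${cr}\$" "$script.unix")

echo "CRLF lines before: $crlf_before, after: $crlf_after"
sh "$script.unix"
```

Writing to a separate copy keeps the original file intact in case the carriage returns were intentional data rather than line endings.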
2. Controlling crontab output
Problem: The /var/spool/clientmqueue directory takes up more than 100G.
Cause: A program run from cron produces output, and cron sends that output to the job's owner as mail. Because sendmail is not running, the messages pile up as files in /var/spool/clientmqueue, and over time they can fill the disk.
Solution: 1) Delete the files manually: in /var/spool/clientmqueue, run ls | xargs rm -f; 2) Fix the root cause: append >/dev/null 2>&1 to the command in the crontab entry.
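The effect of the redirection can be checked outside cron. This is a small sketch in which noisy_job is a hypothetical stand-in for a cron command that writes to both stdout and stderr. It also shows why the order of the two redirections matters.

```shell
#!/bin/sh
# Hypothetical job that writes to both stdout and stderr, as many cron jobs do.
noisy_job() {
    echo "progress message"
    echo "warning message" >&2
}

# No redirection: both streams are captured, which is what cron would mail.
unredirected=$(noisy_job 2>&1)

# The form from the fix: stdout goes to /dev/null, then stderr follows it.
redirected=$(noisy_job >/dev/null 2>&1)

# Reversed order: stderr is pointed at the old stdout first, so it still leaks.
wrong_order=$(noisy_job 2>&1 >/dev/null)

echo "without redirection: ${#unredirected} characters"
echo "with >/dev/null 2>&1: ${#redirected} characters"
echo "with 2>&1 >/dev/null: '$wrong_order'"
```

In a crontab entry the redirection goes at the end of the command field, for example `30 2 * * * /path/to/job.sh >/dev/null 2>&1`, where the schedule and path are placeholders.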
3. telnet is very slow / ssh is very slow
Problem: One day an R&D colleague reported that access from 10.50 to the memcached service on 10.52 was failing and asked us to check whether the network, the service, or the system was abnormal. The system and the service were both normal, and ping from 10.50 to 10.52 was normal too, but telnet from 10.50 to 10.52 was very slow. We also found that the machine's nameserver was not working.
Cause: "because your PC doesn't do a reverse DNS lookup on your IP then... when you telnet/ftp into your linux box, it'll do a dns lookup on you." In other words, the server tries to resolve the client's address, and with a dead nameserver each connection waits for that lookup to time out.
Solution: 1) Edit /etc/hosts so that hostnames and IP addresses correspond; 2) Comment out the nameserver in /etc/resolv.conf, or point it at a nameserver that is alive.
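To confirm that the first fix is in place, one can check whether a given address has a mapping in a hosts file. This is a sketch only: hosts_has_entry is a hypothetical helper, and the addresses and names in the sample file are made up (taken from the documentation range), not the hosts from the incident.

```shell
#!/bin/sh
# Succeed if the hosts-format file $1 maps the IP address $2 to at least one name.
hosts_has_entry() {
    awk -v ip="$2" '$1 == ip && NF >= 2 && $2 !~ /^#/ { found = 1 }
                    END { exit !found }' "$1"
}

# Sample file standing in for /etc/hosts (contents are made up).
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
127.0.0.1   localhost
192.0.2.50  app-client
# 192.0.2.52 cache-server
EOF

for ip in 192.0.2.50 192.0.2.52; do
    if hosts_has_entry "$hosts_file" "$ip"; then
        echo "$ip: mapped"
    else
        echo "$ip: no entry, lookups fall through to DNS"
    fi
done
```

On a real machine the first argument would be /etc/hosts.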
4. Read-only file system
Problem: A colleague could not create a table in MySQL. The prompt was:
mysql> create table wosontest (colddname1 char(1));
ERROR 1005 (HY000): Can't create table 'wosontest' (errno: 30)
The MySQL user's permissions and the relevant directory permissions were fine. Running perror 30 reports: OS error code 30: Read-only file system
Possible causes: 1) file system corruption; 2) bad sectors on the disk; 3) errors in the fstab file, such as a wrong file system type (for example ntfs written as fat) or misspelled options.
Solution: 1) Because this was a test machine, we simply rebooted it, and it recovered after the restart; 2) According to advice found online, the problem can also be solved with mount, for example by remounting the file system read-write.
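Before rebooting, it can be useful to confirm which file systems are actually mounted read-only. This is a minimal sketch, assuming input in the /proc/mounts format. The helper name list_readonly_mounts and the sample devices are made up for illustration.

```shell
#!/bin/sh
# Print mount points whose option list contains "ro".
# $1 is a file in /proc/mounts format: device mountpoint fstype options dump pass
list_readonly_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }' "$1"
}

# Sample data standing in for /proc/mounts (values are made up for the demo).
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/sda1 / ext3 rw,relatime 0 0
/dev/sdb1 /data ext3 ro,relatime,errors=remount-ro 0 0
proc /proc proc rw,nosuid 0 0
EOF

list_readonly_mounts "$sample"
```

On a live system the argument would be /proc/mounts. The options are compared one by one so that errors=remount-ro on a healthy file system is not mistaken for a read-only mount.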
5. A file is deleted but the disk space is not released
Problem: One day df -h on a machine showed 90G of disk space in use, while du -sh /* showed that everything added up to only 30G.
Cause: Someone had probably used rm to delete a file that was still being written. The name is gone, but the space is not released while a process still holds the file open.
Solution: 1) The simplest approach is to restart the system or the related service. 2) Find the process that holds the file and deal with it:
/usr/sbin/lsof | grep deleted
ora 25575 data 33u REG 65,65 4294983680 /oradata/DATAPRE/UNDOTBS009.dbf (deleted)
The lsof output shows that the process with pid 25575 has /oradata/DATAPRE/UNDOTBS009.dbf open on file descriptor (fd) 33. Once the file is found, we can release the space by ending the process, or by truncating the file through its descriptor: echo > /proc/25575/fd/33
3) To clear a file that is still being written, generally use cat /dev/null > file instead of rm.
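The behavior can be reproduced safely with a temporary file. The following Linux-only sketch (it relies on /proc) keeps a file open on descriptor 9 of the current shell, deletes it, and then truncates it through /proc. It uses /proc/self so that it is self-contained. For another process, the path would contain that process's pid and fd as reported by lsof.

```shell
#!/bin/sh
# Linux-only demo: relies on /proc/<pid>/fd.
tmpfile=$(mktemp)

# Keep the file open on descriptor 9 of this shell and write 1 MiB to it.
exec 9>"$tmpfile"
dd if=/dev/zero bs=1024 count=1024 >&9 2>/dev/null

# Delete the name; the data stays allocated because fd 9 is still open.
rm -f "$tmpfile"
link_target=$(readlink /proc/self/fd/9)
size_before=$(wc -c < /proc/self/fd/9)

# Truncate through /proc to release the blocks without ending the process.
: > /proc/self/fd/9
size_after=$(wc -c < /proc/self/fd/9)

echo "fd 9 -> $link_target"
echo "size before: $size_before, after: $size_after"

# Close the descriptor now that the demo is finished.
exec 9>&-
```

Truncating is only appropriate when the contents are disposable, such as a log file. For a database file like the one in the lsof output, restarting the service is the safer route.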
6. Improving the performance of find-based cleanup
Problem: The /tmp directory contains a large number of temporary files named picture_*, and every night at 2:30 the files older than one day are cleaned up. I ran the following script from crontab, but it was very inefficient, and the load surged every time it ran, affecting other services.
#!/bin/sh
find /tmp -name "picture_*" -mtime +1 -exec rm -f {} \;
Cause: The directory holds a very large number of files, and this find invocation is resource-intensive because -exec ... \; starts a separate rm process for every matching file.
Solution:
#!/bin/sh
cd /tmp
time=`date -d "2 day ago" "+%b %d"`
ls -l | grep "picture" | grep "$time" | awk '{print $NF}' | xargs rm -rf
Note that ls -l usually pads single-digit days with a space rather than a zero, so "+%b %e" is the safer date format during the first nine days of a month.
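As an alternative to the ls and grep pipeline (not the approach used above), find can batch the deletions itself. Ending -exec with + instead of \; passes many file names to each rm call. This sketch assumes GNU touch for the -d option and runs against a fresh temporary directory rather than /tmp, with made-up file names.

```shell
#!/bin/sh
# Demo directory standing in for /tmp, created fresh so nothing real is deleted.
demo_dir=$(mktemp -d)

# Two old matching files, one new matching file, and two non-matching files.
touch -d "3 days ago" "$demo_dir/picture_old1" "$demo_dir/picture_old2"
touch -d "3 days ago" "$demo_dir/other_old"
touch "$demo_dir/picture_new" "$demo_dir/other_file"

# "+" makes find hand many names to each rm, instead of one rm per file.
find "$demo_dir" -maxdepth 1 -type f -name 'picture_*' -mtime +1 -exec rm -f {} +

echo "remaining files:"
ls "$demo_dir"
```

Unlike grepping the output of ls -l, this form matches on the file's real modification time and on the picture_ prefix only, so it does not depend on the date format of the listing.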
7. Cannot obtain the gateway MAC address
Problem: The network from 2.14 to 3.65 (mapped address 2.141) is unreachable, but other machines on the 3 segment can reach 3.65 without problems.
Cause:
# arp
Address HWtype HWaddress Flags Mask Iface
192.168.3.254 ether (incomplete) CM bond0
On the surface, the machine cannot automatically obtain the gateway's MAC address. The network engineer said it was a network device problem, but the exact cause was never clarified.
Solution: Bind the ARP entry statically: arp -i bond0 -s 192.168.3.254 00:00:5e:00:01:64
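Unresolved entries like the one above can be picked out of the ARP table automatically. This is a sketch under the assumption that the table is printed with a header line and marks unresolved entries with the word "incomplete". The helper name and the second sample entry are made up.

```shell
#!/bin/sh
# Read arp-style output on stdin and print addresses whose entry is unresolved.
incomplete_arp_entries() {
    awk 'NR > 1 && /incomplete/ { print $1 }'
}

# Sample output (the second entry and its MAC address are made up).
sample_output='Address HWtype HWaddress Flags Mask Iface
192.168.3.254 (incomplete) bond0
192.168.3.10 ether 00:16:3e:00:00:01 C bond0'

printf '%s\n' "$sample_output" | incomplete_arp_entries
```

On a live host the input would come from the arp command itself, for example `arp -n | incomplete_arp_entries`.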
8. An example of the http service failing to start
Problem: One day a colleague said that httpd in the website front-end environment would not start, so I logged in to take a look. It reported the following error:
/etc/init.d/httpd start
Starting httpd: [Sat Jan 29 17:49:00 2011] [warn] module antibot_module is already loaded, skipping
Use proxy forward as remote ip: true.
Antibot exclude pattern: .*.[(js|css|jpg|gif|png)]
Antibot seed check pattern: login
(98)Address already in use: make_sock: could not bind to address [::]:7080
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:7080
no listening sockets available, shutting down
Unable to open logs [FAILED]
Cause: 1) The port is occupied: on the surface, port 7080 looks occupied, but netstat -npl | grep 7080 showed that nothing was listening on 7080; 2) The port is declared more than once in the configuration, for example when Listen 7080 appears in both of these files:
/etc/httpd/conf/httpd.conf
/etc/httpd/conf.d/t.10086.cn.conf
Solution: Comment out Listen 7080 in /etc/httpd/conf.d/t.10086.cn.conf and restart; httpd then starts normally.
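Duplicate Listen directives are easy to miss when they are spread over several files. This is a rough sketch of a check for them: find_duplicate_listen is a hypothetical helper, and the demo config tree and its contents are made up so the example does not touch a real /etc/httpd.

```shell
#!/bin/sh
# Print every Listen argument that appears more than once across the given files.
# Commented-out directives are ignored because their first field is not "Listen".
find_duplicate_listen() {
    awk 'tolower($1) == "listen" { count[$2]++ }
         END { for (port in count) if (count[port] > 1) print port }' "$@"
}

# Demo config tree (file names are illustrative; contents are made up).
conf_dir=$(mktemp -d)
mkdir -p "$conf_dir/conf" "$conf_dir/conf.d"
printf 'ServerRoot "/etc/httpd"\nListen 7080\n' > "$conf_dir/conf/httpd.conf"
printf '# vhost settings\nListen 7080\n' > "$conf_dir/conf.d/t.10086.cn.conf"

find_duplicate_listen "$conf_dir/conf/httpd.conf" "$conf_dir/conf.d/"*.conf
```

Any port printed by the function is bound twice by the same httpd instance, which produces the "Address already in use" error even though netstat shows the port as free.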
9. Too many open files
Problem: The error "too many open files" is reported.
Solution: The definitive fix is to raise the limits:
echo "" >> /etc/security/limits.conf
echo "* soft nproc 65535" >> /etc/security/limits.conf
echo "* hard nproc 65535" >> /etc/security/limits.conf
echo "* soft nofile 65535" >> /etc/security/limits.conf
echo "* hard nofile 65535" >> /etc/security/limits.conf
echo "" >> /root/.bash_profile
echo "ulimit -n 65535" >> /root/.bash_profile
echo "ulimit -u 65535" >> /root/.bash_profile
Finally, restart the machine, or run ulimit -u 65535 && ulimit -n 65535 in the current shell.
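Appending with echo adds duplicate lines if the commands are run a second time. The sketch below shows one way to make the change repeatable: add_limit is a hypothetical helper that appends an entry only when an identical one is not already present. It works on a temporary file that stands in for /etc/security/limits.conf.

```shell
#!/bin/sh
# Append a limits.conf entry unless an identical one already exists.
# $1 = target file, $2 = domain, $3 = type (soft|hard), $4 = item, $5 = value
add_limit() {
    if ! awk -v d="$2" -v t="$3" -v i="$4" -v v="$5" \
        '$1 == d && $2 == t && $3 == i && $4 == v { found = 1 }
         END { exit !found }' "$1"; then
        printf '%s %s %s %s\n' "$2" "$3" "$4" "$5" >> "$1"
    fi
}

# Temporary file standing in for /etc/security/limits.conf.
limits_file=$(mktemp)

# Run the same changes twice; the second pass should add nothing.
for pass in 1 2; do
    add_limit "$limits_file" '*' soft nofile 65535
    add_limit "$limits_file" '*' hard nofile 65535
done

cat "$limits_file"
echo "soft nofile limit of this shell: $(ulimit -n)"
```

The last line prints the limit currently in effect for the running shell, which is a quick way to verify the new values after logging in again.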
10. Disk space problems caused by ibdata1 and mysql-bin
Problem: Disk space alarm on 2.51. ibdata1 and the mysql-bin logs were taking up too much space (ibdata1 over 120G, mysql-bin over 80G).
Cause: ibdata1 is an InnoDB storage file. With InnoDB tables, ibdata1 stores the table data and indexes, while the table files in the per-database directory hold only the table structure. The InnoDB storage engine has two ways of managing tablespaces: 1) the shared tablespace (which can be split into several smaller tablespace files), which is what most of our databases currently use; 2) independent tablespaces, where each table has its own tablespace (disk file). Each approach has advantages and disadvantages:
1) Shared tablespace
Advantages: The tablespace can be divided into multiple files stored on different disks. The tablespace file size is not limited by the size of a table, and one table can be spread across different files.
Disadvantages: All data and indexes are stored together. Even though a large file can be split into several smaller ones, the tables and indexes are still intermixed within the tablespace, so after a large number of delete operations on a table the tablespace is left with many gaps. In shared tablespace mode, space that has been allocated can never be shrunk again. If the tablespace grows because of a temporary index build or a temporary table, dropping the related tables does not give the space back.
2) Independent tablespaces: set innodb_file_per_table in the configuration file (my.cnf).
Features: Each table has its own tablespace, and each table's data and indexes are stored in it.
Advantages: The disk space of a tablespace can be reclaimed. A DROP TABLE operation reclaims the tablespace automatically, and after deleting a large amount of data from a table, the unused space can be reclaimed with: alter table tbl_name engine=innodb;
Disadvantages: If a single table grows too large, for example beyond 100G, performance suffers as well. Splitting files in a shared tablespace has a similar problem: if an access covers too wide a range, it touches several files and becomes slower. With independent tablespaces, partitioned tables can alleviate this to some extent. In addition, when independent tablespaces are enabled, the innodb_open_files parameter needs to be tuned appropriately.
Solution:
1) ibdata1 is too large: the only way is to dump the database to SQL statements and then rebuild it.
2) The mysql-bin logs are too large:
a) Delete manually. Delete the logs up to a given log file: mysql> PURGE MASTER LOGS TO 'mysql-bin.010'; Delete the logs older than a given time: mysql> PURGE MASTER LOGS BEFORE '2010-12-22 13:00:00';
b) Keep only N days of binary logs by setting this in /etc/my.cnf: expire_logs_days=30 (the number of days after which binary logs are deleted automatically)
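Before purging, it helps to know how much space the binary logs actually occupy. This is a small sketch: binlog_bytes is a hypothetical helper that sums the sizes of files named mysql-bin.NNNNNN in one directory. The demo directory and file sizes are made up and stand in for the real MySQL data directory.

```shell
#!/bin/sh
# Sum the sizes in bytes of binary log files (mysql-bin.<digits>) in directory $1.
# The index file mysql-bin.index is skipped because the suffix must start with a digit.
binlog_bytes() {
    find "$1" -maxdepth 1 -type f -name 'mysql-bin.[0-9]*' -exec wc -c {} + |
        awk '$2 != "total" { sum += $1 } END { print sum + 0 }'
}

# Demo directory standing in for the MySQL data directory (sizes are made up).
data_dir=$(mktemp -d)
dd if=/dev/zero of="$data_dir/mysql-bin.000001" bs=1 count=100 2>/dev/null
dd if=/dev/zero of="$data_dir/mysql-bin.000002" bs=1 count=50 2>/dev/null
echo "mysql-bin.000001" > "$data_dir/mysql-bin.index"
dd if=/dev/zero of="$data_dir/ibdata1" bs=1 count=500 2>/dev/null

echo "binary logs use $(binlog_bytes "$data_dir") bytes"
```

The log files themselves should still be removed with PURGE MASTER LOGS rather than rm, so that the index file stays consistent with what is on disk.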
Second: troubleshooting summary table
No. | Symptom | Analysis and resolution |
1 | At the start of a Linux installation, the hard disk cannot be found and the installation cannot proceed. | Enter the CMOS setup, find the hard disk options, and set the disk to compatibility mode. |
2 | During a Linux installation, the installation cannot continue after the hard disk has been partitioned. | The partitioning does not meet the installation requirements. You may have forgotten to create a root partition or a swap partition; this differs from installing Windows. |
3 | During a Linux installation, the package selection is confusing. After installation, the system does not meet the requirements: some needed components are missing, while unneeded ones were installed. | This comes from not yet knowing the Linux system well enough. After a few repeated installations, package selection becomes second nature. |
4 | While configuring a proxy server, some filtering rules turn out to have no effect. | (1) Check whether the corresponding modules are loaded successfully. (2) Check whether the default policy is set properly. (3) Check the iptables command syntax. (4) The order of the filtering rules may be wrong and need adjusting. |
5 | After the proxy server and firewall are configured and the services started, the Internet can be reached but services in the DMZ cannot. | Stop the iptables service and see whether the DMZ services can be reached. If they still cannot, check connectivity. If they can, the problem is in the iptables rules: check the configuration and order of the filtering rules. |
6 | After reconfiguring the iptables filtering rules and restarting the iptables service, all the previously configured rules are lost. | (1) Edit the /etc/sysconfig/iptables-config configuration file and change IPTABLES_SAVE_ON_RESTART="no" to "yes". (2) Save the rules with the command iptables-save > /etc/sysconfig/iptables. |
7 | After VLANs are configured on the switch, the external network cannot be reached. | The VLAN's gateway is not set, or is set incorrectly. |
8 | While configuring the DNS service, the named service cannot be started. | Possible causes: (1) required files are missing from the /etc/named directory; (2) required files are missing from the /var/named directory; (3) permissions for the named account are wrong. Fix: copy the missing files into place, and make sure the files are owned by the named user and group. |
9 | While configuring the DNS service, domain names or IP addresses are not resolved correctly. | (1) Check and correct the syntax and records in the forward and reverse zone files under /var/named. (2) Check whether the zone declarations in /etc/named.conf are written correctly. (3) Check whether the bind-chroot package is installed. If it is, the zone files should be in the /var/named/chroot/var/named directory. (4) Check whether the nameserver entries in /etc/resolv.conf are set correctly. |
10 | When the dhcpd service starts, it reports "No subnet declaration for eth0 (10.10.10.2)". | The IP address of eth0 is set incorrectly: it is not within a scope of the DHCP service. Set the IP address of eth0 to an address within the scope. |
11 | Several scopes are configured for the DHCP service, but addresses are assigned from only one of them. | The host has only one network interface card. With three scopes, you need three interfaces (eth0, eth1, and eth2), one per scope. This is a superscope configuration method. |
12 | The MySQL installation fails, repeatedly reporting package dependency errors, so the packages cannot be installed. | The packages being installed require other components or shared libraries. Installing MySQL from rpm packages is cumbersome: many packages are needed and they depend on one another. Find and install the required packages according to the error messages, and pay attention to the installation order. |
13 | When testing the web service, no page appears when visiting the main site, although the connection to the server succeeds. | The "DocumentRoot" option in the main configuration file httpd.conf is not set properly. For example, a trailing "/" must not be added, as in /var/. |
14 | Remote clients cannot access the Samba share, although the share tests successfully from the local machine. | Turn off the iptables service. |
15 | Samba's smb service has started successfully, but accessing a Samba share returns the error "NT_STATUS_BAD_NETWORK_NAME". | The shared directory has not been created or does not exist. |
16 | Samba's smb service has started successfully, but access returns the error "NT_STATUS_ACCESS_DENIED". | Access is denied: the current user is not allowed to access this share, which usually means the share is set to allow only specific users. If iptables is running, also try stopping the firewall. |
17 | Samba's smb service has started successfully, but access returns the error "NT_STATUS_LOGON_FAILURE". | The logon failed: the login user name or password may be incorrect. |
18 | The FTP service is configured for local user uploads, but uploads to the target directory are refused. | The user account may not have write permission on the upload directory. |
19 | After local accounts are configured for FTP login, the root account cannot log in and gets the error "500 OOPS: cannot change directory:/root", while other local accounts can log in. | Check whether SELinux is enabled, and disable it. You can edit the /etc/selinux/config file and change SELINUX=enforcing to SELINUX=disabled. |
20 | A mail client can send mail but cannot receive it. | Check whether the POP3 service is running. |
21 | Mounting an NFS share with the mount command hangs for a long time, although the NFS service itself is normal. | The portmap service is not running; it must be started. |
22 | Mounting the NFS share succeeds in a local test, but fails from other client hosts. | Stop the iptables service and test again. |