(1) How to escape HTML code in Java
How to escape HTML tags
/**
 * Escapes the HTML special characters in a string and removes control
 * characters and other non-printable characters.
 *
 * @param str the HTML string
 * @return the escaped string
 */
public static String escapeHTML(String str) {
    int length = str.length();
    int newLength = length;
    boolean someCharacterEscaped = false;
    // First pass: compute the length of the result and check whether anything changes.
    for (int i = 0; i < length; i++) {
        char c = str.charAt(i);
        if (c < 32) {
            switch (c) {
            case '\t': case '\n': case '\f': case '\r':
                break; // kept unchanged
            default: // every other control character, including vertical tab (11), is dropped
                newLength--;
                someCharacterEscaped = true;
                break;
            }
        } else {
            switch (c) {
            case '"': // &quot; is 5 characters longer than "
                newLength += 5;
                someCharacterEscaped = true;
                break;
            case '&': case '\'': // &amp; and &#39; are 4 characters longer
                newLength += 4;
                someCharacterEscaped = true;
                break;
            case '<': case '>': // &lt; and &gt; are 3 characters longer
                newLength += 3;
                someCharacterEscaped = true;
                break;
            }
        }
    }
    if (!someCharacterEscaped) {
        return str; // nothing to escape or remove
    }
    // Second pass: build the escaped string in a buffer of exactly the right size.
    StringBuilder sb = new StringBuilder(newLength);
    for (int i = 0; i < length; i++) {
        char c = str.charAt(i);
        if (c < 32) {
            switch (c) {
            case '\t': case '\n': case '\f': case '\r':
                sb.append(c);
                break;
            }
        } else {
            switch (c) {
            case '"':  sb.append("&quot;"); break;
            case '\'': sb.append("&#39;");  break;
            case '&':  sb.append("&amp;");  break;
            case '<':  sb.append("&lt;");   break;
            case '>':  sb.append("&gt;");   break;
            default:   sb.append(c);        break;
            }
        }
    }
    return sb.toString();
}
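Test:
@Test
public void test_001() {
    String input = "<html><input type=\"button\" onclick=\"abc()\" > </html>";
    System.out.println(input);
    System.out.println(StringUtil.escapeHTML(input));
}
Output:
<html><input type="button" onclick="abc()" > </html>
&lt;html&gt;&lt;input type=&quot;button&quot; onclick=&quot;abc()&quot; &gt; &lt;/html&gt;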
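The two-pass method above sizes its buffer exactly before building the result. As a point of comparison, here is a minimal single-pass sketch of the same escaping rules. The class name EscapeHtmlSketch and the method name escape are hypothetical and are not part of the original StringUtil class.

```java
// Hypothetical single-pass variant of the escaping rules used by escapeHTML.
final class EscapeHtmlSketch {

    static String escape(String s) {
        // The extra 16 characters are an arbitrary guess at how much escaping adds.
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
            case '"':  sb.append("&quot;"); break;
            case '\'': sb.append("&#39;");  break;
            case '&':  sb.append("&amp;");  break;
            case '<':  sb.append("&lt;");   break;
            case '>':  sb.append("&gt;");   break;
            default:
                // Keep tab, LF, FF and CR; drop every other control character below 32.
                if (c >= 32 || c == '\t' || c == '\n' || c == '\f' || c == '\r') {
                    sb.append(c);
                }
                break;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: &lt;a href=&quot;x&quot;&gt;Tom &amp; &#39;Jerry&#39;&lt;/a&gt;
        System.out.println(escape("<a href=\"x\">Tom & 'Jerry'</a>"));
    }
}
```

The single pass is shorter, but it always allocates a new string and may resize the buffer. The two-pass version returns the input unchanged when nothing needs escaping.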
(2) How to strip HTML tags in Java, keeping only the text
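/**
 * Removes the HTML markup from the input string and truncates the result.
 *
 * @param input  the HTML string
 * @param length the number of characters to display
 * @return the plain text, cut to length characters and followed by "......" if it was longer
 */
public static String splitAndFilterString(String input, int length) {
    if (input == null || input.trim().equals("")) {
        return "";
    }
    // Remove named entities such as &nbsp;, then remove all HTML elements.
    String str = input.replaceAll("\\&[a-zA-Z]{1,10};", "").replaceAll(
            "<[^>]*>", "");
    // Remove any remaining ( / > ) < characters.
    str = str.replaceAll("[(/>)<]", "");
    int len = str.length();
    if (len <= length) {
        return str;
    } else {
        str = str.substring(0, length);
        str += "......";
    }
    return str;
}

/**
 * Returns plain text: removes all HTML tags and also removes empty lines.
 *
 * @param input the HTML string
 * @return the plain text
 */
public static String splitAndFilterString(String input) {
    if (input == null || input.trim().equals("")) {
        return "";
    }
    // Remove named entities such as &nbsp;, then remove all HTML elements.
    String str = input.replaceAll("\\&[a-zA-Z]{1,10};", "").replaceAll(
            "<[^>]*>", "");
    // Remove any remaining ( / > ) < characters.
    str = str.replaceAll("[(/>)<]", "");
    return SystemHWUtil.deleteCRLF(str);
}

/**
 * Deletes every line break together with any whitespace that follows it.
 *
 * @param input the text to process
 * @return the text without line breaks
 */
public static String deleteAllCRLF(String input) {
    return input.replaceAll("((\r\n)|\n)[\\s\t ]*", "").replaceAll(
            "^((\r\n)|\n)", "");
}

/**
 * Deletes CRLF sequences, empty lines and blank lines.
 *
 * @param input the text to process
 * @return the cleaned text
 */
public static String deleteCRLF(String input) {
    // deleteCRLFOnce is another SystemHWUtil method; it is not shown in this post.
    input = SystemHWUtil.deleteCRLFOnce(input);
    return SystemHWUtil.deleteCRLFOnce(input);
}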
See the class com\common\util\SystemHWUtil.java
The source code is in the attachment.
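To see what the three replaceAll steps in splitAndFilterString do to a concrete input, they can be traced in a small self-contained sketch. The class name StripHtmlSketch and the method names stripTags and truncate are hypothetical. The line-break cleanup is left out because deleteCRLFOnce is not shown in this post.

```java
// Hypothetical stand-alone trace of the stripping steps used by splitAndFilterString.
final class StripHtmlSketch {

    static String stripTags(String input) {
        if (input == null || input.trim().isEmpty()) {
            return "";
        }
        // Step 1: drop named entities such as &nbsp; (numeric ones like &#39; do not match).
        String noEntities = input.replaceAll("&[a-zA-Z]{1,10};", "");
        // Step 2: drop everything that looks like a tag.
        String noTags = noEntities.replaceAll("<[^>]*>", "");
        // Step 3: drop leftover ( / > ) < characters, even when they are part of the text.
        return noTags.replaceAll("[(/>)<]", "");
    }

    // Cuts the text to at most length characters and marks the cut with "......".
    static String truncate(String text, int length) {
        return text.length() <= length ? text : text.substring(0, length) + "......";
    }

    public static void main(String[] args) {
        String html = "<p>Hello&nbsp;<b>world</b> (1/2)</p>";
        String text = stripTags(html);
        System.out.println(text);              // prints: Helloworld 12
        System.out.println(truncate(text, 5)); // prints: Hello......
    }
}
```

The trace shows two side effects of this approach: the removed &nbsp; leaves no space behind, and parentheses and slashes in ordinary text are removed along with the markup.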
Original post: http://hw1287789687.iteye.com/blog/2175951