Friday, January 21, 2011

HTTP GET

Ran into a nasty problem with one of our webapps recently.

We have some jobs that enable and disable a number of WebLogic entities (OSB proxies, JMS queues, SOA composites). We noticed that some of these jobs were failing.

The exception we saw was ERROR: Another session operation is in progress. Please retry later.

As part of our process (to disable OSB proxies) we need to create and use a WebLogic session. We only create one instance, and then activate the changes at the end. But in our logs a second session was being created.

After careful inspection it turned out the second session was being created by a different thread. Checking the access.log confirmed a second request arriving 30 seconds after the first. (This may be specific to certain versions of Firefox; it needs more investigation.)

It appears this is because we were using an HTTP GET for our request (we should be using POST). Apparently, on seeing no response or acknowledgement, the browser was resubmitting the request of its own accord.

So now we had two conflicting requests, both trying to use a WebLogic session. Obviously the second fails, and that failure is what was reported to the user.

Meanwhile the original request succeeds silently.

Our fix is to change the request to a POST.

We are also adding a token to the POST request to guard against double submits.
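
The token check could look something like the sketch below. This is not our actual code: the class SubmitTokenStore and its method names are made up for illustration. The idea is that the form is rendered with a one-time token in a hidden field, the POST handler consumes it, and a replayed request carrying the same token is rejected.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of WebLogic or of our webapp.
final class SubmitTokenStore {
    // Tokens that have been handed out and not yet used.
    private final Set<String> issued = ConcurrentHashMap.newKeySet();

    /** Called when rendering the form; the value goes into a hidden field. */
    String issue() {
        String token = UUID.randomUUID().toString();
        issued.add(token);
        return token;
    }

    /** Called by the POST handler; true only the first time a token is seen. */
    boolean consume(String token) {
        // remove() is atomic, so two racing requests cannot both succeed.
        return token != null && issued.remove(token);
    }
}
```

In a real webapp the store would normally live in the user's HTTP session so that tokens expire with it.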

….

2011-01-18 20:51:21,335 [DEBUG] myClass:[ACTIVE] ExecuteThread: '3' for queue: 'weblogic.kernel.Default (self-tuning)':- Realizing changes in the session: WebAdminTool1295383871917

2011-01-18 20:51:21,759 [DEBUG] myClass:[ACTIVE] ExecuteThread: '3' for queue: 'weblogic.kernel.Default (self-tuning)':- Setting service OSBProxy::false to true

2011-01-18 20:51:28,551 [INFO ] ie.bge.middleware.jmswebtool.data.impl.jmx.QueuePersistenceJmxImpl:[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)':- Result from pauseConsumption: null

2011-01-18 20:51:28,554 [INFO ] ie.bge.middleware.jmswebtool.data.impl.jmx.QueuePersistenceJmxImpl:[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)':- Result from pauseConsumption: null

...

2011-01-18 20:51:28,644 [DEBUG] myClass :[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)':- Creating new Weblogic session WebAdminTool1295383888606

2011-01-18 20:51:48,669 [ERROR] myClass:[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)':- ERROR: Another session operation is in progress. Please retry later.

...

2011-01-18 20:51:58,571 [DEBUG] myClass:[ACTIVE] ExecuteThread: '3' for queue: 'weblogic.kernel.Default (self-tuning)':- Activated changes in the session: WebAdminTool1295383871917

Monday, January 17, 2011

Free memory Grid Control Agent

One problem we had with our Linux systems is: “where is all the memory gone?”

We’ve been there in the past and it looks like the Grid Control (GC) Agent suffers from exactly this misunderstanding.

In our case we have:

- Each of our PROD boxes has 32 GB (GiB to be precise) of physical memory

- They all have 16 GB of swap space

- None of them is really using the swap space

- Two out of three report less than 1 GB of free memory

    o one: 166 MB

    o two: 720 MB

    o three: 3978 MB

- All three are using 20 GB or more memory for caching

    o one: 23 GB

    o two: 21 GB

    o three: 20 GB

So, where is the memory gone? Caching of course!

Checking memory by looking only at the free figure reported by the free command, or only at the MemFree line in /proc/meminfo, is not good enough on Linux systems.

If there is any memory available, the kernel will take it for I/O caching. If another process requests more memory, the kernel will take it out of the chunk used for I/O caching.

Thus, in Linux systems, the memory available to applications is the memory reported to be free PLUS the memory used for I/O caching.

So if we add the two values:

grep ^MemFree /proc/meminfo | awk '{ print $2 }'

grep ^Cached /proc/meminfo | awk '{ print $2 }'

we get the effective memory available to applications (memory free).
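
The same sum can be done in code, for instance from a monitoring job. A minimal sketch in Java, assuming the usual "Name: value kB" layout of /proc/meminfo; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Illustrative only: adds MemFree and Cached as described above.
final class MemInfo {
    /** Returns MemFree + Cached in kB, given the lines of /proc/meminfo. */
    static long effectiveFreeKb(List<String> lines) {
        long total = 0;
        for (String line : lines) {
            // Match the exact keys so that e.g. SwapCached is not counted.
            if (line.startsWith("MemFree:") || line.startsWith("Cached:")) {
                String[] parts = line.trim().split("\\s+");
                total += Long.parseLong(parts[1]);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("/proc/meminfo"));
        System.out.println(effectiveFreeKb(lines) + " kB");
    }
}
```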

Looks like this is the bit where the GC Agent gets confused.

What can we do?

- We could flush the I/O cache and monitor

- We could check with Oracle what the GC Agent is supposed to be reporting (and/or report a bug)

- Just monitor MemFree and Cached and wait for the next GC alert (monitoring should be done by TSG, they already are flooding our system logs w/ SNMP daemon messages)

- A combination of any of the above

HTTP Authentication

Despite working with JEE for years I have always rolled my own Authentication solution. In my current position I inherited a FORM based security solution.

This website gives a nice overview:

http://onjava.com/pub/a/onjava/2002/06/12/form.html

Of particular note:

The auth method is defined in web.xml in the following section:

<login-config>
    <auth-method>FORM</auth-method>
    <realm-name>myrealm</realm-name>
    <form-login-config>
        <form-login-page>/login.jsp</form-login-page>
        <form-error-page>/fail_login.html</form-error-page>
    </form-login-config>
</login-config>


The login form must contain fields for entering the username and password. These fields must be named j_username and j_password, respectively.

The form should post these values to the logical name j_security_check. (Use SSL to ensure the passwords are protected.)

In our case we were using this on WebLogic with the default myrealm, which uses the users, groups and roles defined in the WebLogic domain.

Oracle JCA and OSB

This is a right design cock-up on Oracle's part, if you ask me.

Let's create a new OSB proxy to communicate with a database table. In past projects we took the custom approach of rolling our own DB adapter, by creating a JAR resource and accessing it via the OSB proxy.

Architecturally, however, this approach would have been frowned upon. In my current position I’m inheriting a system built by Oracle consultancy, and they took the ‘approved’ design approach of using TopLink to create a DB adapter and communicating with it via JCA.

Unfortunately it is a bit of a disaster.

Roadblock one:

To create or edit the TopLink adapter we must use JDeveloper.

To create or edit the OSB proxy we must use Eclipse or the web front end.

This Oracle-owned blog entry even sounds exasperated, but it was a life-saver in terms of getting the system up and running.

JMS lifecycle and Searching Filters (Message Selector)

The JMS message lifecycle can be summarized as below with respect to the following two states:

1. A message sent by a JMS producer
* Without any associated transaction:
It is immediately current.
* Within a JTA transaction or transacted JMS session:
It remains pending until the transaction is committed or rolled back. If the transaction is committed, the message becomes available, and if the transaction is rolled back, the message is removed from the destination.
* With a specified TimeToDeliver property:
It remains pending until the TimeToDeliver period expires. This is expected because the purpose of the TimeToDeliver property is to keep the message unavailable to consumers until the specified time period.

2. A message received by a JMS consumer
* Without any associated JTA transaction and in a non-transacted JMS session with an acknowledgement mode of NO_ACKNOWLEDGE or MULTICAST_NO_ACKNOWLEDGE:
It is immediately removed from the destination.
* Within a JTA transaction or transacted JMS session:
It becomes pending until the transaction or session is committed or rolled back. On commit, the message is removed from the destination, and on rollback, the message becomes available again, unless a Redelivery delay is specified, in which case it continues to remain pending until the Redelivery delay.
* Within a non-transacted JMS session with the acknowledgement mode of AUTO_ACKNOWLEDGE, DUPS_OK_ACKNOWLEDGE, or CLIENT_ACKNOWLEDGE:
It becomes pending until the acknowledgement is received.

Message Selector

When searching for messages in a JMS queue, there is an expression language for specifying the search criteria. This is known as a message selector.

This is the format, using an example from WebLogic that I was working with:
JMSMessageID = 'ID:<982769.1294675696539.0>'

What confused me was the ID: prefix. I assumed it was the variable name, so I kept changing the format to ID=<> and so on. In fact it is part of the message ID value itself.
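
To avoid getting the format wrong again, the selector can be built by a small helper. A sketch only: the class name Selectors is made up, and it assumes the caller passes the complete message ID including its ID: prefix. Single quotes inside the value are doubled, which is how the selector syntax escapes them in string literals.

```java
// Hypothetical helper for building a message selector by message ID.
final class Selectors {
    /** Builds a selector matching one message, e.g. JMSMessageID = 'ID:<...>'. */
    static String byMessageId(String messageId) {
        // The whole value, "ID:" prefix included, goes inside the quotes.
        return "JMSMessageID = '" + messageId.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        System.out.println(byMessageId("ID:<982769.1294675696539.0>"));
    }
}
```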

Note that selectors can also use logical operations, see http://download.oracle.com/javaee/6/api/javax/jms/Message.html
e.g.
"JMSType = 'car' AND color = 'blue' AND weight > 2500"

State String
The current state of a message, which could be one of DELAYED, EXPIRED, ORDERED, PAUSED, RECEIVE, REDELIVERY_COUNT_EXCEEDED, SEND, TRANSACTION, or VISIBLE.

Note on WebLogic Unit of Order
We were having problems with a standalone utility program that is used to display JMS messages (see this post). It could not display some messages. One common theme appeared to be that they were in the ORDERED state, as opposed to the VISIBLE state. This is due to the messages using Unit of Order to ensure ordered delivery.

Another post (http://forum.springsource.org/showthread.php?t=69398) noted that if a message is in the RECEIVE state, this may be caused by having two WebLogic servers with the same name, where one WebLogic server is consuming from a queue served by the other.

Note also the XML manipulation required on the JMSMessage:

private static final Integer JMS_ALL_STATES = Integer.valueOf(0x7fffffff);

// Looks up a single message by its ID; returns null if nothing is found or the lookup fails.
// msgId is passed in here; server, username, password, jmsModuleName, queueName,
// jmxCon and result are assumed to be fields of the enclosing class.
public JmsMessage findMessage(String msgId) {
    try {
        jmxCon = getJMXConnector(server, username, password);
        jmxCon.connect();
        MBeanServerConnection con = jmxCon.getMBeanServerConnection();
        ObjectName destination = findQueueObjectName(con, jmsModuleName, queueName);
        // Open a cursor over messages in any state that match the selector
        String newCursor = (String) con.invoke(destination, "getMessages", new Object[] { "JMSMessageID='" + msgId + "'", QueuePersistenceJmxImpl.BROWSE_TIMEOUT, JMS_ALL_STATES }, OP_GET_MESSAGES_ALL_SIGNATURE);
        result.setCursor(newCursor);
        // Should only be one value so no need to sort
        //con.invoke(destination, "sort", new Object[] { newCursor, POSITION_UNDEFINED, DEFAULT_ORDERING_ATTRS, DEFAULT_ORDER }, OP_SORT_SIGNATURE);
        result.setNumMessages((Long) con.invoke(destination, "getCursorSize", new Object[] { newCursor }, OP_GET_CURSOR_SIZE_SIGNATURE));
        Long initialPosition = Long.valueOf(0);
        Integer pageSize = Integer.valueOf(100);
        CompositeData[] data = (CompositeData[]) con.invoke(destination, "getItems", new Object[] { newCursor, initialPosition, pageSize }, OP_GET_ITEMS_SIGNATURE);
        if (data != null) {
            JmsMessage[] ret = toJmsMessages(data, new CursorParam(destination, con, newCursor));
            if (ret != null && ret.length > 0) {
                if (ret.length > 1)
                    logger.warn("findMessage (by messageId) returned more than one result" + JavaUtil.toString(ret));
                return ret[0];
            }
        }
    } catch (Exception e) {
        // throw new RuntimeException("Error while browsing messages",
        // e);
        // ignore the error and try the next node
        logger.error("Error searching for message " + msgId + " on server " + server, e);
    }
    return null;
}

private static JMXConnector getJMXConnector(Server server, String username, String password) throws IOException {
    String fullServerURL = "service:jmx:iiop://" + server.getUrl() + "/jndi/weblogic.management.mbeanservers.runtime";
    JMXServiceURL serviceUrl = new JMXServiceURL(fullServerURL);

    // Connection environment: WebLogic protocol provider plus credentials
    Hashtable<String, Object> env = new Hashtable<String, Object>();
    env.put(JMXConnectorFactory.PROTOCOL_PROVIDER_PACKAGES, "weblogic.management.remote");
    env.put(javax.naming.Context.SECURITY_PRINCIPAL, username);
    env.put(javax.naming.Context.SECURITY_CREDENTIALS, CryptoUtilWrapper.decrypt(password));

    // The caller is responsible for calling connect()
    return JMXConnectorFactory.newJMXConnector(serviceUrl, env);
}

private JmsMessage[] toJmsMessages(CompositeData[] data, CursorParam params) {
    XPathUtils xpathUtils = null;
    Document d;
    JmsMessage[] ret = new JmsMessage[data.length];
    int i = 0;
    MBeanServerConnection con = params.getConnection();
    ObjectName destination = params.getDestination();
    try {
        for (CompositeData resItem : data) {
            JmsMessage msg = new JmsMessage();
            // The cursor item carries the message header as XML text
            d = XMLUtils.stringToDocument((String) resItem.get("MessageXMLText"));
            xpathUtils = new XPathUtils(d, QueuePersistenceJmxImpl.JMS_MESSAGE_NS_CTX);
            xpathUtils.setDocument(d);
            msg.setId(xpathUtils.findValue("/mes:WLJMSMessage/mes:Header/mes:JMSMessageID"));
            msg.setTimestamp(new Date(Long.parseLong(xpathUtils.findValue("/mes:WLJMSMessage/mes:Header/mes:JMSTimestamp"))));
            // Get body: needs a second call, keyed by cursor and message ID
            CompositeData body = (CompositeData) con.invoke(destination, "getMessage", new Object[] { params.getCursor(), msg.getId() }, OP_CURSOR_GET_MESSAGE_SIGNATURE);
            d = XMLUtils.stringToDocument((String) body.get("MessageXMLText"));
            xpathUtils = new XPathUtils(d, QueuePersistenceJmxImpl.JMS_MESSAGE_NS_CTX);
            xpathUtils.setDocument(d);
            msg.setBody(xpathUtils.findValue("/mes:WLJMSMessage/mes:Body/mes:Text"));
            ret[i++] = msg;
        }
        return ret;
    } catch (Exception e) {
        throw new RuntimeException("Error while transforming CompositeData to JmsMessage", e);
    }
}

Oracle partitioning table

Just a note on how we partitioned a large table (TRANSACTION) in Oracle into monthly partitions. It involves creating a swap table to hold the existing data (by renaming the original table), then copying that data into separate partitions of a new table.



DROP TABLE TRANSACTION_SWAP
/
ALTER TABLE TRANSACTION DROP CONSTRAINT TRANSACTION_PK
/
DROP INDEX TRANSACTION_IDX1
/
DROP INDEX TRANSACTION_IDX2
/
RENAME TRANSACTION TO TRANSACTION_SWAP
/
REM
REM Now create new version of table, partitioned and with a different primary key
REM

CREATE TABLE TRANSACTION (
"ID" NUMBER NOT NULL ENABLE,
"ORIGINATOR" VARCHAR2(4 BYTE),
"ORIGINATOR_INSTANCE_ID" VARCHAR2(100 BYTE),
"MESSAGE_REF" VARCHAR2(512 BYTE),
"ENTITY_REF" VARCHAR2(512 BYTE),
"TIME" TIMESTAMP(6),
"STATUS" VARCHAR2(5 BYTE),
"JMS_MESSAGE_ID" VARCHAR2(30 BYTE),
"PROJECT_NAME" VARCHAR2(384 BYTE),
"OPERATION" VARCHAR2(150 BYTE),
"INTERFACE" VARCHAR2(20 BYTE))
Partition By Range(Time)
(Partition P_2010_10_31 Values Less Than (To_Timestamp('2010-11-01','YYYY-MM-DD')),
Partition P_2010_11_30 Values Less Than (To_Timestamp('2010-12-01','YYYY-MM-DD')),
Partition P_2010_12_31 Values Less Than (To_Timestamp('2011-01-01','YYYY-MM-DD')),
Partition P_2011_01_31 Values Less Than (To_Timestamp('2011-02-01','YYYY-MM-DD')),
Partition P_High Values Less Than(Maxvalue));
/*
REM
REM Copy over all data before creating indexes - for efficiency reasons
REM Note in some environments, the time can be null on some entries in the table.
REM
REM These rows will be placed in a distinct table
REM
*/

INSERT INTO TRANSACTION
(ID
,ORIGINATOR
,ORIGINATOR_INSTANCE_ID
,MESSAGE_REF
,ENTITY_REF
,TIME
,STATUS
,JMS_MESSAGE_ID
,PROJECT_NAME
,OPERATION
,INTERFACE)
SELECT
ID
,ORIGINATOR
,ORIGINATOR_INSTANCE_ID
,MESSAGE_REF
,ENTITY_REF
,TIME
,STATUS
,JMS_MESSAGE_ID
,PROJECT_NAME
,OPERATION
,INTERFACE
FROM TRANSACTION_SWAP
WHERE TIME IS NOT NULL
/
DROP TABLE TRANSACTION_ARC_NOTIME
/
CREATE TABLE TRANSACTION_ARC_NOTIME
AS
SELECT *
FROM TRANSACTION_SWAP
WHERE TIME IS NULL
/
/*
REM
REM Now create indexes
REM
*/
rem ALTER TABLE TRANSACTION ADD (CONSTRAINT "TRANSACTION_PK" PRIMARY KEY (TIME,ID));
CREATE UNIQUE INDEX TRANSACTION_UID1 ON TRANSACTION(TIME,ID) LOCAL;

ALTER TABLE TRANSACTION ADD (CONSTRAINT TRANSACTION_PK PRIMARY KEY (TIME, ID) USING INDEX TRANSACTION_UID1);

CREATE INDEX TRANSACTION_IDX1 On TRANSACTION (ORIGINATOR_INSTANCE_ID, STATUS) LOCAL;
REM
REM And re-create a grant which is present on production. This may fail elsewhere
REM
GRANT SELECT ON TRANSACTION TO READONLY
/

REM
REM Create the archive table at the same time
REM
CREATE TABLE TRANSACTION_ARC (
"ID" NUMBER NOT NULL ENABLE,
"ORIGINATOR" VARCHAR2(4 BYTE),
"ORIGINATOR_INSTANCE_ID" VARCHAR2(100 BYTE),
"MESSAGE_REF" VARCHAR2(512 BYTE),
"ENTITY_REF" VARCHAR2(512 BYTE),
"TIME" TIMESTAMP(6),
"STATUS" VARCHAR2(5 BYTE),
"JMS_MESSAGE_ID" VARCHAR2(30 BYTE),
"PROJECT_NAME" VARCHAR2(384 BYTE),
"OPERATION" VARCHAR2(150 BYTE),
"INTERFACE" VARCHAR2(20 BYTE))
Partition By Range(Time)
(Partition P_2010_10_31 Values Less Than (To_Timestamp('2010-11-01','YYYY-MM-DD')),
Partition P_2010_11_30 Values Less Than (To_Timestamp('2010-12-01','YYYY-MM-DD')),
Partition P_2010_12_31 Values Less Than (To_Timestamp('2011-01-01','YYYY-MM-DD')),
Partition P_2011_01_31 Values Less Than (To_Timestamp('2011-02-01','YYYY-MM-DD')),
Partition P_High Values Less Than(Maxvalue));

REM
REM recreate any package body with a new version which defaults the time in the table,
REM to make sure there is always a value there
REM
@TrackingProcedureBody

PROMPT *** OPERATION COMPLETE ***
PROMPT
PROMPT *** WHEN HAPPY WITH RESULTS, TRUNCATE THE TABLE TRANSACTION_SWAP
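
The partition clauses in the script follow a fixed pattern: each partition is named after the last day of its month and bounded by the first day of the next month. A small generator saves typing them by hand when more months are needed. This is only a sketch; the class name PartitionClauses is made up, and it just prints text in the same shape as the script above.

```java
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;

// Hypothetical generator for the monthly range-partition clauses used above.
final class PartitionClauses {
    private static final DateTimeFormatter NAME = DateTimeFormatter.ofPattern("yyyy_MM_dd");
    private static final DateTimeFormatter BOUND = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    /** One clause: named after the month's last day, bounded by the next month's first day. */
    static String forMonth(YearMonth month) {
        String name = "P_" + month.atEndOfMonth().format(NAME);
        String bound = month.plusMonths(1).atDay(1).format(BOUND);
        return "Partition " + name + " Values Less Than (To_Timestamp('" + bound + "','YYYY-MM-DD'))";
    }

    public static void main(String[] args) {
        // Print clauses for three consecutive months, starting February 2011.
        YearMonth start = YearMonth.of(2011, 2);
        for (int i = 0; i < 3; i++) {
            System.out.println(forMonth(start.plusMonths(i)) + ",");
        }
    }
}
```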